REMARKS

Claims 1-5 remain pending in the instant application. No amendments have been made. 
Applicants respond below to the specific rejections set forth in the Office Action mailed 
December 13, 2004. 

Rejection under 35 U.S.C. §101 - Utility 

The PTO has maintained the rejection of Claims 1-5 under 35 U.S.C. § 101 as lacking 
patentable utility. The PTO argues that the claimed invention is not supported by either a 
specific and substantial asserted utility or a well established utility for the reasons of record on 
pages 2 and 3 of the previous Office Action. 

Specifically, according to the PTO, Applicants' previous arguments regarding utility were 
not persuasive. The PTO argues that utility requires that the skilled artisan be able to use the 
claimed invention, and that the specification does not provide a specific and substantial or a
well-established use. The PTO further argues that Applicants have provided only a single analysis
without any relative range for basing a utility of underexpression. The PTO asserts that it "is not 
disclosed what type(s) of lung or stomach tumor was analyzed," and that it "is not clear if the 
findings can be generalized to all tumors from that tissue type." Also, the PTO argues that the 
skilled artisan would not know if the results were significant or under what conditions the 
difference in expression could be detected. 

The PTO makes various other arguments on pages 3 and 4 of the Final Office Action,
including:

[W]ithout knowing the range of variation there is insufficient guidance. If a clinician 
took a stomach tissue sample from a patient with a suspected stomach cancer, what is 
the likelihood that when compared with normal tissue, the level of nucleic acid of 
SEQ ID NO:77 from the patient would be lower? How many samples would be 
needed? What sensitivity would be needed? Would the normal tissue have to be a 
pooled sample or could it be from a single individual? 

The PTO further argues that "[t]he statement that the relative difference in expression is 
what is important is generally true, but without more specifics about necessary sample size, 
expression level range for normal and tumor tissues, types of stomach or lung tissue that can be 
used, and other questions, the specification has not provided the invention in a form readily 
usable by the skilled [artisan] such that significant further experimentation was unnecessary."

With regard to the correlation between nucleic acid expression and expression of the
corresponding protein, the PTO also continues to argue that there is no evidence that the
polypeptide of PRO1357 is underexpressed in stomach tumors or lung tumors. In support, the
PTO argues that the data are insufficient because only relative expression data were presented.
The PTO argues that there is no evidentiary support to the assertions by Dr. Polakis that it 
remains the central dogma in molecular biology that increased mRNA levels are predictive of 
corresponding increased levels of the encoded polypeptide. 

The PTO references Hu et al. (2003, Journal of Proteome Research 2:405-412) as refuting 
Applicants' assertions regarding the correlation between nucleic acid expression and polypeptide
expression. Also, the PTO attempts to refute the other Declarations and references submitted by 
Applicants by referencing papers by Haynes et al. (1998, Electrophoresis 19:1862-1871) and 
Konopka et al. (1986, PNAS 83:4049-4052). 

Applicants reiterate that the claimed invention has utility in diagnosing cancer, 
specifically in the diagnosis of stomach cancer and lung cancer. The specification in Example 18 
discloses data showing that the nucleic acid of SEQ ID NO:77 is more highly expressed in 
normal stomach tissue or normal lung tissue compared to stomach tumor or lung tumor, 
respectively. The data in Example 18 are more than sufficient to satisfy the correct utility standard
under 35 U.S.C. § 101. Nonetheless, Applicants have provided various declarations by those of 
skill in the art, literature references, and textbook passages further supporting the claimed utility. 
The references cited by the PTO to refute the claimed utility do not actually support a lack of 
utility. In view of this and the discussion below, Applicants respectfully submit that the claimed 
antibodies have a credible, substantial, and specific utility. 

Utility need NOT be proven to an Absolute Certainty - a Correlation between the Evidence and 
the Asserted Utility is Sufficient 

Compliance with 35 U.S.C. § 101 is a question of fact. Raytheon v. Roper, 724 F.2d 951, 
956, 220 USPQ 592, 596 (Fed. Cir. 1983), cert. denied, 469 U.S. 835 (1984). The evidentiary
standard to be used throughout ex parte examination in setting forth a rejection is a 
preponderance of the evidence, or "more likely than not" standard. In re Oetiker, 977 F.2d 1443, 
1445, 24 USPQ2d 1443, 1444 (Fed. Cir. 1992). This is stated explicitly in the M.P.E.P.: 

[T]he applicant does not have to provide evidence sufficient to establish that an
asserted utility is true "beyond a reasonable doubt." Nor must the applicant 
provide evidence such that it establishes an asserted utility as a matter of 
statistical certainty. Instead, evidence will be sufficient if, considered as a whole, 
it leads a person of ordinary skill in the art to conclude that the asserted utility is 
more likely than not. M.P.E.P. at § 2107.02, part VII (2004) (emphasis in
original, internal citations omitted). 

The PTO has the initial burden to offer evidence "that one of ordinary skill in the art 
would reasonably doubt the asserted utility." In re Brana, 51 F.3d 1560, 1566, 34 U.S.P.Q.2d 
1436 (Fed. Cir. 1995). Only then does the burden shift to the Applicant to provide rebuttal 
evidence. Id. As stated in the M.P.E.P., such rebuttal evidence does not need to absolutely prove 
that the asserted utility is real. Rather, the evidence only needs to be reasonably indicative of the 
asserted utility. 

In Fujikawa v. Wattanasin, 93 F.3d 1559, 39 U.S.P.Q.2d 1895 (Fed. Cir. 1996), the Court
of Appeals for the Federal Circuit upheld a PTO decision that in vitro testing of a novel
pharmaceutical compound was sufficient to establish practical utility, stating the following rule:

[T]esting is often required to establish practical utility. But the test results need 
not absolutely prove that the compound is pharmacologically active. All that is 
required is that the tests be "reasonably indicative of the desired 
[pharmacological] response." In other words, there must be a sufficient 
correlation between the tests and an asserted pharmacological activity so as to 
convince those skilled in the art, to a reasonable probability, that the novel 
compound will exhibit the asserted pharmacological behavior. Fujikawa v.
Wattanasin, 93 F.3d 1559, 1564, 39 U.S.P.Q.2d 1895 (Fed. Cir. 1996) (internal 
citations omitted, bold emphasis added, italics in original). 

While the Fujikawa case was in the context of utility for pharmaceutical compounds, the 
principles stated by the Court are applicable in the instant case where the asserted utility is for a 
diagnostic use - utility does not have to be established to an absolute certainty, rather, the 
evidence must convince a person of skill in the art "to a reasonable probability." In addition, the 
evidence need not be direct, so long as there is a "sufficient correlation" between the tests 
performed and the asserted utility. 

Here, as discussed more fully below, the evidence of record, including the declarations by 
those of skill in the art and the teachings of the specification, is convincing to a person of skill in 
the art "to a reasonable probability."

The Court in Fujikawa relied in part on its decision in Cross v. Iizuka, 753 F.2d 1040,
224 U.S.P.Q. 739 (Fed. Cir. 1985). In Cross, the Appellant argued that basic in vitro tests 
conducted in cellular fractions did not establish a practical utility for the claimed compounds. 
Appellant argued that more sophisticated in vitro tests using intact cells, or in vivo tests, were 
necessary to establish a practical utility. The Court in Cross rejected this argument, instead 
favoring the argument of the Appellee: 

[I]n vitro results... are generally predictive of in vivo test results, i.e., there is a
reasonable correlation therebetween. Were this not so, the testing procedures of 
the pharmaceutical industry would not be as they are. [Appellee] has not urged, 
and rightly so, that there is an invariable exact correlation between in vitro test 
results and in vivo test results. Rather, [Appellee's] position is that successful in 
vitro testing for a particular pharmacological activity establishes a significant 
probability that in vivo testing for this particular pharmacological activity will be 
successful. Cross v. Iizuka, 753 F.2d 1040, 1050, 224 U.S.P.Q. 739 (Fed. Cir.
1985) (emphasis added). 

The Cross case is very similar to the present case. Here, no additional sophisticated 
testing or further research is required to establish the utility of the nucleic acids, the encoded 
polypeptides or the claimed antibodies in cancer diagnostics. As with in vitro testing in the 
pharmaceutical industry, those of ordinary skill in the art recognize to a reasonable probability 
that a showing of differential expression of mRNA in cancerous cells compared to normal cells 
indicates a real world utility in cancer diagnostics for the nucleic acids, their encoded 
polypeptides and the antibodies to the polypeptides. One of ordinary skill in the art would rely 
upon the differential expression data in Example 18 as reasonably indicating a real world use for
the nucleic acids, the encoded polypeptides and the claimed antibodies. Those of ordinary skill 
in the art recognize a reasonable correlation between differential expression in cancerous versus 
non-cancerous cells and utility in distinguishing between those cells. 

Also, as in Cross, Applicants here do not argue that there is "an invariable exact 
correlation" between differential expression and diagnostic markers. Instead, Applicants' 
position detailed below is that the data in Example 18 are reliable and significant, as well as more 
than sufficient to establish a "significant probability" that the differential expression of the 
nucleic acids and PRO1357 polypeptides in cancerous versus non-cancerous tissue provides
diagnostic utility for the same, as well as antibodies to the same, based on "a reasonable
correlation therebetween." In order to satisfy the proper utility standard, no further research or
testing is required. One of skill in the art more likely than not would recognize utility for the
claimed antibodies based upon the differential expression data in Example 18. 

Also, those of skill in the field of biotechnology rely on the reasonable correlation that 
exists between gene expression and protein expression (see below). Were there no reasonable 
correlation between the two, the techniques that measure gene levels such as microarray analysis, 
differential display, and quantitative PCR would not be so widely used by those in the art. As in 
Cross, Applicants here do not argue that there is "an invariable exact correlation" between gene 
expression and protein expression. Instead, Applicants' position detailed below is that a 
measured change in gene expression in cancer cells establishes a "significant probability" that the 
expression of the encoded polypeptide in cancer will also be changed based on "a reasonable 
correlation therebetween." 

Even assuming that the PTO has met its initial burden to offer evidence that one of 
ordinary skill in the art would reasonably doubt the truth of the asserted utility, Applicants assert 
that they have met their burden of providing rebuttal evidence such that it is more likely than not 
those skilled in the art, to a reasonable probability, would believe that the PRO1357 polypeptide
and antibodies thereto are useful as diagnostic tools for stomach and lung cancer. Applicants 
further address and rebut the specific points raised by the Examiner in the Final Office Action. 

The Differential Expression of the PRO1357 mRNA confers Utility upon the Nucleic Acids, the
PRO1357 Polypeptide, and the Claimed Antibodies

Applicants submit that the evidence of record demonstrates a substantial and specific 
utility for the claimed antibodies. No additional testing is required to establish to "a reasonable 
probability" the utility of the claimed antibodies in cancer diagnostics. Those of skill in the art 
recognize a "real world" utility based upon the data set forth in the application as filed. As set 
forth in Example 18, nucleic acids encoding the PRO1357 polypeptide are more highly expressed
in normal stomach tissue or normal lung tissue compared to stomach tumor or lung tumor, 
respectively. Thus, one of skill in the art would recognize that it is more likely than not that the 
polypeptide is more highly expressed in normal stomach tissue or normal lung tissue compared 
to stomach tumor or lung tumor, respectively. It is this differential expression of the nucleic 
acids and the polypeptides in the respective cancer and non-cancerous cells that makes the 
nucleic acids, the polypeptides and the antibodies to the polypeptides useful in the diagnosis of 
cancer, as it serves as the basis for detecting a difference between cancerous and non-cancerous
cells. 

Nonetheless, the PTO continues to argue that the data in Example 18 are insufficient and
do not provide an immediate use for the claimed subject matter. Applicants strongly disagree. 
The MPEP states: 

Office personnel must be careful not to interpret the phrase 'immediate benefit to the 
public' or similar formulations in other cases to mean that products or services based 
on the claimed invention must be 'currently available' to the public in order to satisfy 
the utility requirement. See, e.g., Brenner v. Manson, 383 U.S. 519, 534-35, 148 
USPQ 689, 695 (1966). Rather, any reasonable use that an applicant has identified 
for the invention that can be viewed as providing a public benefit should be accepted 
as sufficient, at least with regard to defining a 'substantial' utility. Courts have 
repeatedly found that the mere identification of a pharmacological activity of a 
compound that is relevant to an asserted pharmacological use provides an 'immediate 
benefit to the public' and thus satisfies the utility requirement. See Nelson v. Bowler,
626 F.2d 853, 856, 206 USPQ 881, 883 (CCPA 1980). 

M.P.E.P. at 2107.01 (Emphasis in original). 

The data in Example 18 are sufficient to provide an "immediate benefit to the public" or a
"real world" use. Applicants have identified molecules that can be used as cancer markers. This 
knowledge provides an immediate public benefit. Under § 101 the utility does not have to be an 
FDA approved use, one that is ready for commercial sale, or even one that is ready for clinical 
use. Use as a cancer marker based upon the differential expression data in Example 18 is
sufficient for utility under § 101. The data show that the nucleic acids encoding the PRO1357
polypeptide are differentially expressed in stomach and lung tissue compared to cancerous 
stomach and lung tissue, respectively. Example 18 explains that standard techniques and 
controls were utilized. Specifically, the widely accepted technique of PCR was used to 
determine "whether the polynucleotides tested were more highly expressed, less expressed, or 
whether expression remained the same in tumor tissue as compared to its normal counterpart." 
Furthermore, the first Grimaldi Declaration explains the techniques and protocols, stating that the 
gene expression studies reported in Example 18 of the instant application were performed using
pooled samples of normal and of tumor tissues. With regard to reliability, Mr. Grimaldi explains 
that: 

The DNA libraries used in the gene expression studies were made from pooled 
samples of normal and of tumor tissues. Data from pooled samples is more likely 
to be accurate than data obtained from a sample from a single individual. That is, 
the detection of variations in gene expression is likely to represent a more
generally relevant condition when pooled samples from normal tissues are 
compared with pooled samples from tumors in the same tissue type. 
(Paragraph 5). 

Furthermore, the Declaration of Grimaldi explained the significance of the results and 
that the methodology utilized in Example 18 indicates that the difference in expression levels 
between normal and cancerous cells is at least two fold. In paragraphs 6 and 7, Mr. Grimaldi 
explains that the semi-quantitative analysis employed to generate the data of Example 18 is
sufficient to determine if a gene is over- or underexpressed in tumor cells compared to 
corresponding normal tissue. "Because this technique relies on the visual detection of ethidium 
bromide staining of PCR products on agarose gels, it is reasonable to assume that any detectable 
differences seen between two samples will represent at least a two fold difference in cDNA." He 
also states that the results of the gene expression studies indicate that the genes of interest "can 
be used to differentiate tumor from normal." He explains that "[t]he precise levels of gene 
expression are irrelevant; what matters is that there is a relative difference in expression between 
normal tissue and tumor tissue." (Paragraph 7). Thus, since it is the relative level of expression 
between normal tissue and suspected cancerous tissue that is important, the precise level of 
expression in normal tissue is irrelevant. Likewise, there is no need for additional quantitative 
data to compare the level of expression in normal and tumor tissue. As Mr. Grimaldi states, "If a 
difference is detected, this indicates that the gene and its corresponding polypeptide and 
antibodies against the polypeptide are useful for diagnostic purposes, to screen samples to 
differentiate between normal and tumor." (emphasis added). Thus, there is guidance as to the
levels of expression, particularly, that at a minimum, there is at least a two fold difference 
between the expression levels in cancerous cells compared to the corresponding non-cancerous 
cells. Therefore, contrary to the conclusions in the Final Office Action, Applicants have 
established that one of skill in the relevant art would recognize that there is a real world 
significance to the differential expression data set forth in the specification, and that a reasonable 
correlation exists between the data and the claimed antibodies in diagnostics. 

Despite the presented data, the various declarations and the submitted literature 
references in support, many of the arguments made by the PTO attempt to refute the operability 
of the claimed invention. 

Applicants remind the PTO that:

A small degree of utility is sufficient . . . The claimed invention must only be capable
of performing some beneficial function ... An invention does not lack utility merely 
because the particular embodiment disclosed in the patent lacks perfection or 
performs crudely ... A commercially successful product is not required . . . Nor is it 
essential that the invention accomplish all its intended functions ... or operate under 
all conditions . . . partial success being sufficient to demonstrate patentable utility . . . 
In short, the defense of non-utility cannot be sustained without proof of total 
incapacity. See E.I. du Pont De Nemours and Co. v. Berkley and Co., 620 F.2d 1247, 
1260 n.17, 205 USPQ 1, 10 n.17 (8th Cir. 1980). If an invention is only partially 
successful in achieving a useful result, a rejection of the claimed invention as a whole 
based on a lack of utility is not appropriate. See In re Brana, 51 F.3d 1560, 34 
USPQ2d 1436 (Fed. Cir. 1995); In re Gardner, 475 F.2d 1389, 177 USPQ 396 
(CCPA), reh'g denied, 480 F.2d 879 (CCPA 1973); In re Marzocchi, 439 F.2d 220, 
169 USPQ 367 (CCPA 1971). 

M.P.E.P. at 2107.01 (emphasis in original). 

Here, the claimed subject matter can be used by those of skill in the art to differentiate 
cancerous versus non-cancerous stomach and lung cells. Thus, the claimed subject matter is 
operative and the various questions raised by the PTO do not defeat the utility under § 101.

For these reasons, Applicants submit that one of ordinary skill in the art would reasonably 
recognize the utility of the claimed antibodies. 

Applicants have Established that the Accepted Understanding in the Art is that there is a Direct
Correlation between Comparative mRNA Levels and the Level of Expression of the Encoded
Protein in Normal versus Cancerous Tissue

Applicants maintain that it is more likely than not that the encoded polypeptide is more 
highly expressed in normal stomach tissue or normal lung tissue compared to stomach tumor or 
lung tumor tissue respectively. Applicants reiterate that the evidence of record establishes a 
correlation between mRNA expression and protein expression. As stated above, the standard for 
utility is not absolute certainty, but rather whether one of skill in the art would be more likely 
than not to believe the asserted utility. The working hypothesis among those skilled in the art is 
that there is a direct correlation between mRNA levels and protein levels. Despite some 
teachings in the art of certain genes that do not fit within this paradigm, which are exceptions 
rather than the rule, in the vast majority of cases, the combined teachings in the art, exemplified
by the previously submitted papers, for example by Orntoft et al., Hyman et al. and Pollack et al.
and the previously submitted Grimaldi and Polakis declarations, overwhelmingly teach that gene
expression influences mRNA expression and protein levels. Thus, one of skill in the art would
reasonably expect, in this instance, based on the gene expression data for the PRO1357 gene, that
the PRO1357 protein is concomitantly over-expressed in normal stomach tissue compared to
stomach tumor tissue, and in normal lung tissue compared to lung tumor tissue. 

The statements of Grimaldi and Polakis are further supported by the teachings of two 
leading textbooks. Molecular Biology of the Cell is a leading cell biology textbook in the field 
(Bruce Alberts, et al., Molecular Biology of the Cell (4th ed. 2002), submitted herewith as Exhibit
1). Figure 6-3 on page 302 illustrates the basic principle that there is a correlation between 
increased gene expression and increased protein expression. The accompanying text states that 
"a cell can change (or regulate) the expression of each of its genes according to the needs of the 
moment - most obviously by controlling the production of its mRNA." Molecular Biology of the 
Cell at 302, emphasis added. Similarly, figure 6-90 on page 364 illustrates the path from gene to 
protein. The accompanying text states that while potentially each step can be regulated by the 
cell, "the initiation of transcription is the most common point for a cell to regulate the expression 
of each of its genes." Molecular Biology of the Cell at 364. This point is repeated on page 379, 
where the authors state that of all the possible points for regulating protein expression, "[f]or 
most genes transcriptional controls are paramount." Molecular Biology of the Cell at 379. 

Also, support for Applicants' position can be found in the Lewin textbook (Genes VI 
(1997) CH 29, pp. 847-848; submitted herewith as Exhibit 2) which also states that "having 
acknowledged that control of gene expression can occur at multiple stages, and that production of 
RNA cannot inevitably be equated with production of protein, it is clear that the overwhelming 
majority of regulatory events occur at the initiation of transcription" (emphasis added).

Still more support is found in Zhigang et al., World Journal of Surgical Oncology 2:13,
2004, submitted herewith as Exhibit 3. Zhigang studied the expression of prostate stem cell 
antigen (PSCA) protein and mRNA to validate it as a potential molecular target for diagnosis and 
treatment of human prostate cancer. The data showed "a high degree of correlation between 
PSCA protein and mRNA expression" (see page 4 of Exhibit 3). Of the samples tested, 81 out of 
87 showed a high degree of correlation between mRNA expression and protein expression. The 
authors conclude that "it is demonstrated that PSCA protein and mRNA overexpressed in human 
prostate cancer, and that the increased protein level of PSCA was resulted from the upregulated 
transcription of its mRNA." Exhibit 3, page 6. Even though the correlation between mRNA 
expression and protein expression occurred in 93% of the samples tested, not 100%, the authors
state that "PSCA may be a promising molecular marker for the clinical prognosis of human Pca
and a valuable target for diagnosis and therapy of this tumor." Exhibit 3, page 7. 

Further, Meric et al., Molecular Cancer Therapeutics, vol. 1, 971-979 (2002), submitted
herewith as Exhibit 4, states the following: 

The fundamental principle of molecular therapeutics in cancer is to exploit the 
differences in gene expression between cancer cells and normal cells... [M]ost 
efforts have concentrated on identifying differences in gene expression at the level 
of mRNA, which can be attributable to either DNA amplification or to differences 
in transcription. Meric et al. at 971 (emphasis added). 

This statement provides additional support for Applicants' asserted utility. It is true that there is 
no necessary correlation between gene expression and protein expression because there are other 
mechanisms for regulating gene expression. However, were there no significant correlation 
between gene expression and protein levels, exploiting differences in gene expression between 
cancer cells and normal cells would not be a "fundamental principle of molecular therapeutics in 
cancer." Moreover, as mentioned above, Applicants need not establish a necessary connection 
between gene expression and protein expression. Rather, there need only be a reasonable 
correlation between the evidence offered and the asserted utility such that it is more likely than 
not that a person of skill in the art would be convinced, to a reasonable probability, that the 
asserted utility is true. 

The PTO relies on Hu et al. (J. Proteome Res., 2(4):405-12 (2003)) to support its 
assertion that the literature cautions researchers from drawing conclusions based on small 
changes in transcript expression levels between normal and cancerous tissue. Applicants 
respectfully submit that Hu does not satisfy the PTO's burden to offer evidence that one of 
ordinary skill in the art would reasonably doubt the truth of the asserted utility. 

In Hu, the researchers used an automated literature-mining tool to summarize and
estimate the relative strengths of all human gene-disease relationships published on Medline.
They then generated a microarray expression dataset comparing breast cancer and normal breast
tissue. Using their data-mining tool, they looked for a correlation between the strength of the
literature association between the gene and breast cancer, and the magnitude of the difference in
expression level. They report that for genes displaying a 5-fold change or less in tumors
compared to normal, there was no evidence of a correlation between altered gene expression and
a known role in the disease. See Hu at 411. However, among genes with a 10-fold or more
change in expression level, there was a strong correlation between expression level and a 
published role in the disease. See id. at 412. Importantly, Hu reports that the observed 
correlation was only found among estrogen receptor-positive tumors, not ER-negative tumors. 
See id. 

The general findings of Hu are not surprising - one would expect that genes that have the 
greatest change in expression in a disease would be the first targets of research, and therefore 
have the strongest known relationship to the disease as measured by the number of reports of a 
connection in the literature. But this does not mean that genes, and their corresponding proteins, 
with a lower level of change in expression are not important or cannot be used as molecular 
markers of the disease. This is demonstrated by the fact that ER-negative tumors did not show a 
correlation. The correlation reported in Hu only indicates that the greater the change in 
expression level, the more likely it is that there is a published or known role for the gene in the 
disease, as found by their automated literature-mining software. Nowhere in Hu does it say that a 
lack of correlation in their study means that the genes, and their corresponding proteins, with a 
less than five-fold change in level of expression in cancer cannot serve as a molecular marker of 
cancer. Genes with lower levels of change in expression may or may not be the most important 
genes in causing the disease, but the genes and their corresponding proteins can still show a 
consistent and measurable change in expression. While such genes and polypeptides may or may 
not be good targets for further research, they can nonetheless be used as diagnostic tools. Thus, 
Hu does not refute the Applicants' assertion that the PRO1357 polypeptide, and its encoding
nucleic acid, can be used as a cancer diagnostic tool because they are differentially expressed in 
certain tumors. 

The PTO also relies on Haynes et al. (1998, Electrophoresis 19:1862-1871) and Konopka 
et al. (1986, PNAS 83:4049-4052) to attempt to refute the general rule that there is a correlation 
between nucleic acid expression and protein expression. The references do not support the 
PTO's position or refute the utility of the claims. 

Haynes is a review article dealing with the art of proteome analysis. The assertions in 
Haynes cited by the Examiner were made in an effort to identify shortcomings in the art of 
mRNA quantification to argue for "proteome analysis to become an essential component in the 
comprehensive analysis of biological systems." Haynes, p. 1863. Haynes studied 80 selected 
samples from Saccharomyces cerevisiae, and reported "a general trend but no strong correlation
between protein and transcript levels (Fig. 1)." Id. However, a cursory inspection of Fig. 1 
shows a clear correlation between the mRNA levels and protein levels measured. This 
correlation is confirmed by an inspection of the full-length research paper from which the data in 
Fig. 1 were derived, presented herein as Exhibit 5 (Gygi et al., Molecular and Cellular Biology, 
Mar. 1999, 1720-1730). Gygi states that "there was a general trend of increased protein levels 
resulting from increased mRNA levels," with a correlation coefficient of 0.935, indicating a 
strong correlation. Gygi, p. 1726. Moreover, Gygi also states that the correlation is especially 
strong for highly expressed mRNAs. Id. Considering that Example 18 of the specification 
shows higher expression of PRO1357 mRNA in normal stomach and normal lung tissue as
compared to stomach tumor and lung tumor, Haynes and Gygi actually provide strong evidence 
in support of a general correlation between mRNA and protein levels, and thus further support 
the utility of the PRO 1357 polypeptides and the claimed antibodies to the same. 

The 50-fold variation referred to by Haynes and cited by the Examiner does not in any 
way show the absence of a correlation between mRNA and protein levels, but rather identifies 
the outer limits of variability in the authors' experiments. This variability may support the 
authors' assertion that the amount of a particular protein cannot accurately predict the particular 
level of the corresponding mRNA transcript, but it does not suggest an absence of a general 
correlation between mRNA and protein levels. Again, Applicants' utility is based on the 
differential expression of mRNA in normal stomach and lung versus stomach tumor and lung 
tumor. Exact levels of expression are irrelevant. Moreover, Gygi states that the high degree of 
variability seen at low levels of mRNA (shown in inset of Fig. 1, Haynes p. 1863) is due to the 
fact that "the magnitude of the error in the measurement of mRNA levels is inversely 
proportional to the mRNA levels." Gygi, p. 1727. Considering that PRO1357 mRNA has been 
shown in Example 18 of the specification to be more highly expressed in normal stomach tissue 
than stomach tumor, and in normal lung tissue than in lung tumor, the variability identified by 
Haynes is even less applicable to establishing the absence of a correlation between mRNA and 
protein levels in the instant case. 

As stated above, the standard for utility is not absolute certainty, but rather whether one 
of skill in the art would be more likely than not to believe the asserted utility. Here, the utility 
requirement does not require Applicants to show that mRNA levels correlate to protein levels in 
every case, but rather only that the correlation exists more often than not. The data presented in 
Haynes is not inconsistent with or contradictory to the utility or enablement of the instant claims. 
To the contrary, the data clearly show a general correlation between protein levels and mRNA 
levels, and thus support Applicants' assertion that such a general correlation exists. 

Even if Haynes supported the Examiner's argument, which it does not, one contrary 
example does not establish that one of skill in the art would find it is more likely than not there is 
no general correlation between mRNA level and protein levels. In fact, the working hypothesis 
among those skilled in the art, as illustrated by the evidence presented above by Applicants, is 
that there is a direct correlation between mRNA levels and protein levels. This is further 
supported by the statement in Haynes that "interpretations of quantitative mRNA expression 
profiles frequently implicitly or explicitly assume that for specific genes the transcript levels are 
indicative of the levels of protein expression." See, Haynes, p. 1863, first full paragraph. 
Haynes does not suggest there is no correlation between mRNA and protein levels, but rather 
points to what the authors believe are shortcomings of using mRNA quantification to predict 
protein levels; specifically, that mRNA levels may not accurately predict protein levels in each 
particular instance. Considering the more likely than not standard for utility, Haynes' 
identification of reasons why proteomic analysis may be preferable in some cases does not 
contradict Applicants' evidence that there is a general correlation between mRNA and protein 
levels. 

The PTO cites Konopka in further support of the assertion that it is not the norm that 
levels of mRNA correlate with corresponding protein levels. The PTO has confused the 
relationship between an increase in copy number of a gene and the level of mRNA on the one 
hand, with the relationship between mRNA expression and levels of the corresponding protein on 
the other. In particular, the PTO cites the statement in Konopka that "protein expression is not 
related to amplification of the abl gene but to variation in the level of the bcr-abl mRNA 
produced from a single Ph1 template." The results presented in Konopka actually present strong 
evidence in support of Applicants' position that there is a general understanding in the art that 
levels of mRNA correlate with levels of the corresponding proteins. Konopka analyzed the 
expression patterns of a gene associated with certain cancers. The authors show a wide variation 
in the levels of the protein in various cell types, and find that this variation can be attributed to 
the levels of the corresponding mRNA in each cell type. See, Konopka at 4050. Konopka thus 
concludes, "these combined data suggest that differential bcr-abl mRNA expression from a single 
gene template is responsible for the variable levels of P210 c-abl [the protein of interest] detected." 
Id. at 4051. Thus, far from supporting the PTO's assertion that it is not the norm that increased 
transcription leads to increased levels of the corresponding protein, Konopka strongly supports 
the opposite proposition asserted by Applicants - that the level of mRNA, more often than not, 
correlates with the level of the corresponding protein. 

Thus, the references cited by the PTO as evidence that no such general understanding 
exists do not in any way support the PTO's arguments. To the contrary, Haynes and Konopka 
each provide strong evidence of the correlation asserted by Applicants. Accordingly, Applicants 
respectfully submit that the totality of the evidence clearly supports the conclusion that one of 
skill in the art understands that, more likely than not, levels of mRNA directly correlate with 
levels of corresponding proteins. 

Together, the declarations of Grimaldi and Polakis, the accompanying references, as well 
as the other submitted excerpts and literature references, all establish that the accepted 
understanding in the art is that there is a reasonable correlation between changes in gene 
expression and the level of the encoded protein. Although the art teaches that certain genes do 
not fit within this paradigm, these are exceptions rather than the rule. In the vast majority of 
cases, the combined teachings in the art, exemplified by cited references and the 
Grimaldi and Polakis declarations, overwhelmingly teach that gene expression influences mRNA 
expression and protein levels. Considering that utility does not have to be proven to an absolute 
certainty, Applicants submit they have provided sufficient evidence to show a reasonable 
correlation between mRNA expression and the level of PRO1357 protein. In light of the lack of 
support for any argument by the PTO to the contrary, Applicants submit that they have 
established that it is more likely than not that one of skill in the art would believe that because 
the PRO1357 mRNA is more highly expressed in normal stomach tissue and normal lung tissue 
compared to stomach tumor tissue and lung tumor tissue, respectively, the PRO1357 polypeptide 
will also be more highly expressed in normal stomach tissue and normal lung tissue compared to 
stomach tumor tissue and lung tumor tissue, respectively. One of skill in the art would recognize 
that a nucleic acid or polypeptide which is differentially expressed in certain cancer cells 
compared to the corresponding normal tissue would have utility as a diagnostic tool to screen 
between normal and tumor tissue samples. Thus, Applicants submit that they have established 
that it is more likely than not that one of skill in the art would recognize the asserted utility of the 
differentially expressed nucleic acids, the PRO1357 polypeptides, and the claimed antibodies as 
diagnostic tools for both stomach and lung tumors. 

Conclusion 

As set forth above, Applicants have established that the ordinary skilled artisan 
recognizes the sufficiency of the correlation between the differential expression data in Example 
18 and use in distinguishing between cell types (cancerous versus non-cancerous) so as to 
convince the skilled artisan to a reasonable probability that the claimed antibodies are useful as a 
screening tool in cancer diagnostics. Applicants have rebutted all of the arguments set forth in 
the Final Office Action regarding this matter. 

Also, Applicants have provided additional evidence in support of the correlation between 
DNA expression and protein expression. 

Thus, given the totality of the evidence provided, Applicants submit that they have 
established a substantial, specific, and credible utility for the nucleic acids, the encoded 
PRO1357 polypeptides, and the claimed antibodies as diagnostic agents. According to the PTO 
Utility Examination Guidelines (2001), irrefutable proof of a claimed utility is not required. 
Rather, a specific, substantial, and credible utility requires only a "reasonable" confirmation of a 
real world context of use. Applicants submit that they have established that it is more likely than 
not that one of skill in the art would reasonably accept the utility for the claimed antibodies set 
forth in the specification. Applicants believe that they have met their burden of establishing a 
specific and substantial credible utility for the claimed invention. 

In view of the above, Applicants respectfully request that the PTO reconsider and 
withdraw the utility rejection under 35 U.S.C. § 101. 

Rejection under 35 U.S.C. § 112, first paragraph - Enablement 

The PTO has maintained the rejection of Claims 1-5 as lacking enablement under 35 
U.S.C. § 112, first paragraph. According to the Examiner, because the claimed invention is not 
supported by either a substantial asserted utility or a well established utility, one of skill in the art 
would not know how to use the invention. 

Applicants believe that the evidence, declarations, references, and arguments discussed 
above make clear that Applicants have established that one of skill in the art would be convinced, 
to a reasonable probability, that the PRO1357 polypeptides are overexpressed in normal stomach 
tissue and normal lung tissue, and that therefore antibodies to those polypeptides have utility as 
diagnostic tools for screening tissue to detect stomach and lung tumors. Diagnostic and 
therapeutic antibodies can be created that bind to the encoded PRO1357 polypeptides. This is 
disclosed in the application, and the techniques for the creation of antibodies are well known and 
routine in the art. Thus, at least one use of PRO1357 nucleic acids, the encoded polypeptides and 
the claimed antibodies is adequately enabled, which is all that is required - "if any use is enabled 
when multiple uses are disclosed, the application is enabling for the claimed invention." 
M.P.E.P. 2164.01(c). In view of the above, Applicants respectfully request that the Examiner 
reconsider and withdraw the enablement rejection under 35 U.S.C. § 112, first paragraph. 

Rejection under 35 U.S.C. § 102(b) - Anticipation 

Claims 1-5 remain rejected under 35 U.S.C. § 102(b) as being anticipated by 
WO 01/16318 and WO 00/12708. 

The data in Example 18 were disclosed in priority application, PCT/US00/23328 filed 
August 24, 2000, which is the PCT application published as WO 01/16318. As discussed above, 
the instant claimed subject matter has utility based upon the data in Example 18 and the instant 
application is a continuation of PCT/US00/23328; therefore, the present claims are entitled to the 
filing date of August 24, 2000. WO 01/16318 is not prior art under § 102(b). 

WO 00/12708 was published on March 9, 2000, which is less than one year before the 
filing of priority application PCT/US00/23328 (August 24, 2000). Again, PCT/US00/23328 
discloses the differential expression data which provides utility for the instant claims, and 
Applicants are entitled to the filing date of August 24, 2000. Therefore, WO 00/12708 cannot be 
cited under § 102(b). 

In view of the above discussion, reconsideration and withdrawal of the rejection under 
§ 102(b) is respectfully requested. 






Appl.No. : 10/063,587 

Filed : May 3, 2002 

CONCLUSION 

In view of the above, Applicants respectfully maintain that claims are patentable and 
request that they be passed to issue. Applicants invite the Examiner to call the undersigned if any 
remaining issues may be resolved by telephone. 

Please charge any additional fees, including any fees for additional extension of time, or 
credit overpayment to Deposit Account No. 11-1410. 

Respectfully submitted, 

KNOBBE, MARTENS, OLSON & BEAR, LLP 



Dated: 




By: 

FROM DNA TO RNA 

Transcription and translation are the means by which cells read out, or express, 
the genetic instructions in their genes. Because many identical RNA copies can 
be made from the same gene, and each RNA molecule can direct the synthesis 
of many identical protein molecules, cells can synthesize a large amount of 
protein rapidly when necessary. But each gene can also be transcribed and 
translated with a different efficiency, allowing the cell to make vast quantities of 
some proteins and tiny quantities of others (Figure 6-3). Moreover, as we see in 
the next chapter, a cell can change (or regulate) the expression of each of its 
genes according to the needs of the moment— most obviously by controlling 
the production of its RNA. 



Figure 6-3 Genes can be expressed 
with different efficiencies. Gene A is 
transcribed and translated much more 
efficiently than gene B. This allows the 
amount of protein A in the cell to be 
much greater than that of protein B. 



Portions of DNA Sequence Are Transcribed into RNA 

The first step a cell takes in reading out a needed part of its genetic instructions 
is to copy a particular portion of its DNA nucleotide sequence — a gene— into an 
RNA nucleotide sequence. The information in RNA, although copied into another 
chemical form, is still written in essentially the same language as it is in DNA — 
the language of a nucleotide sequence. Hence the name transcription. 

Like DNA, RNA is a linear polymer made of four different types of nucleotide 
subunits linked together by phosphodiester bonds (Figure 6-4). It differs from 
DNA chemically in two respects: (1) the nucleotides in RNA are 
ribonucleotides, that is, they contain the sugar ribose (hence the name 
ribonucleic acid) rather than deoxyribose; (2) although, like DNA, RNA contains the 
bases adenine (A), guanine (G), and cytosine (C), it contains the base uracil (U) 
instead of the thymine (T) in DNA. Since U, like T, can base-pair by 
hydrogen-bonding with A (Figure 6-5), the complementary base-pairing properties 
described for DNA in Chapters 4 and 5 apply also to RNA (in RNA, G pairs with 
C, and A pairs with U). It is not uncommon, however, to find other types of base 
pairs in RNA: for example, G pairing with U occasionally. 
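
The base-pairing rules in the preceding paragraph can be illustrated with a short sketch. The Python below is an illustration only and is not part of the quoted text: the function name `transcribe_template` and the example sequence are hypothetical. It models only the pairing rules (A with U, T with A, G with C, C with G) and assumes the standard antiparallel orientation of paired strands, so the RNA is written 5' to 3' by reversing the complemented template.

```python
# Illustrative sketch only: names and the example sequence are hypothetical.
# Pairing between a DNA template base and the RNA base it specifies.
DNA_TO_RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe_template(template_5_to_3: str) -> str:
    """Return the RNA (written 5'->3') complementary to a DNA template strand.

    The template is given 5'->3'. Paired strands are antiparallel, so the
    complemented sequence is reversed to make the RNA read 5'->3' as well.
    """
    template = template_5_to_3.upper()
    unknown = set(template) - set(DNA_TO_RNA_PAIR)
    if unknown:
        raise ValueError(f"unexpected DNA bases: {sorted(unknown)}")
    return "".join(DNA_TO_RNA_PAIR[base] for base in reversed(template))


if __name__ == "__main__":
    # The RNA contains U wherever the template contains A, and never contains T.
    print(transcribe_template("TACGGA"))
```

The sketch ignores the less common pairs mentioned above (such as G with U), which arise when an RNA chain folds rather than during copying from the DNA template.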

Despite these small chemical differences, DNA and RNA differ quite dra- 
matically in overall structure. Whereas DNA always occurs in cells as a double- 
stranded helix, RNA is single-stranded. RNA chains therefore fold up into a 
variety of shapes, just as a polypeptide chain folds up to form the final shape of 
a protein (Figure 6-6) . As we see later in this chapter, the ability to fold into com- 
plex three-dimensional shapes allows some RNA molecules to have structural 
and catalytic functions. 



Transcription Produces RNA Complementary to 
One Strand of DNA 

All of the RNA in a cell is made by DNA transcription, a process that has cer- 
tain similarities to the process of DNA replication discussed in Chapter 5. 

Figure 6-89 Protein aggregates that cause human disease. (A) Schematic illustration of the type of 
conformational change in a protein that produces material for a cross-beta filament. (B) Diagram illustrating 
the self-infectious nature of the protein aggregation that is central to prion diseases. PrP is highly unusual 
because the misfolded version of the protein, called PrP*, induces the normal PrP protein it contacts to 
change its conformation, as shown. Most of the human diseases caused by protein aggregation are caused by 
the overproduction of a variant protein that is especially prone to aggregation, but because this structure is 
not infectious in this way, it cannot spread from one animal to another. (C) Drawing of a cross-beta filament, 
a common type of protease-resistant protein aggregate found in a variety of human neurological diseases. 
Because the hydrogen-bond interactions in a β sheet form between polypeptide backbone atoms (see Figure 
3-9), a number of different abnormally folded proteins can produce this structure. (D) One of several 
possible models for the conversion of PrP to PrP*, showing the likely change of two α-helices into four 
β-strands. Although the structure of the normal protein has been determined accurately, the structure of the 
infectious form is not yet known with certainty because the aggregation has prevented the use of standard 
structural techniques. (C, courtesy of Louise Serpell, adapted from M. Sunde et al., J. Mol. Biol. 273:729-739, 
1997; D, adapted from S.B. Prusiner, Trends Biochem. Sci. 21:482-487, 1996.) 

animals and humans, it can be dangerous to eat the tissues of animals that 
contain PrP*, as witnessed most recently by the spread of BSE (commonly referred 
to as the "mad cow disease") from cattle to humans in Great Britain. 

Fortunately, in the absence of PrP*, PrP is extraordinarily difficult to convert 
to its abnormal form. Although very few proteins have the potential to misfold 
into an infectious conformation, a similar transformation has been discovered 
to be the cause of an otherwise mysterious "protein-only inheritance" observed 
in yeast cells. 

There Are Many Steps From DNA to Protein 

We have seen so far in this chapter that many different types of chemical 
reactions are required to produce a properly folded protein from the information 
contained in a gene (Figure 6-90). The final level of a properly folded protein in 
a cell therefore depends upon the efficiency with which each of the many steps 
is performed. 

We discuss in Chapter 7 that cells have the ability to change the levels of 
their proteins according to their needs. In principle, any or all of the steps in 
Figure 6-90 could be regulated by the cell for each individual protein. However, as 
we shall see in Chapter 7, the initiation of transcription is the most common 
point for a cell to regulate the expression of each of its genes. This makes sense, 
inasmuch as the most efficient way to keep a gene from being expressed is to 
block the very first step, the transcription of its DNA sequence into an RNA 
molecule. 

Figure 6-90 The production of a 
protein by a eucaryotic cell. The final 
level of each protein in a eucaryotic cell 
depends upon the efficiency of each step 
depicted. 



Summary 

The translation of the nucleotide sequence of an mRNA molecule into protein takes 
place in the cytoplasm on a large ribonucleoprotein assembly called a ribosome. The 
amino acids used for protein synthesis are first attached to a family of tRNA 
molecules, each of which recognizes, by complementary base-pair interactions, 
particular sets of three nucleotides in the mRNA (codons). The sequence of nucleotides in 
the mRNA is then read from one end to the other in sets of three according to the 
genetic code. 

To initiate translation, a small ribosomal subunit binds to the mRNA molecule 
at a start codon (AUG) that is recognized by a unique initiator tRNA molecule. A 
large ribosomal subunit binds to complete the ribosome and begin the elongation 
phase of protein synthesis. During this phase, aminoacyl tRNAs, each bearing a 
specific amino acid, bind sequentially to the appropriate codon in mRNA by forming 
complementary base pairs with the tRNA anticodon. Each amino acid is added to the 
C-terminal end of the growing polypeptide by means of a cycle of three sequential 

Figure 7-5 Six steps at which 
eucaryotic gene expression can be 
controlled. Controls that operate at 
steps 1 through 5 are discussed in this 
chapter. Step 6, the regulation of protein 
activity, includes reversible activation or 
inactivation by protein phosphorylation 
(discussed in Chapter 3) as well as 
irreversible inactivation by proteolytic 
degradation (discussed in Chapter 6). 



Gene Expression Can Be Regulated at Many of the Steps 
in the Pathway from DNA to RNA to Protein 

If differences among the various cell types of an organism depend on the 
particular genes that the cells express, at what level is the control of gene expression 
exercised? As we saw in the last chapter, there are many steps in the pathway 
leading from DNA to protein, and all of them can in principle be regulated. Thus 
a cell can control the proteins it makes by (1) controlling when and how often a 
given gene is transcribed (transcriptional control), (2) controlling how the RNA 
transcript is spliced or otherwise processed (RNA processing control), (3) 
selecting which completed mRNAs in the cell nucleus are exported to the cytosol 
and determining where in the cytosol they are localized (RNA transport and 
localization control), (4) selecting which mRNAs in the cytoplasm are translated 
by ribosomes (translational control), (5) selectively destabilizing certain mRNA 
molecules in the cytoplasm (mRNA degradation control), or (6) selectively 
activating, inactivating, degrading, or compartmentalizing specific protein 
molecules after they have been made (protein activity control) (Figure 7-5). 

For most genes transcriptional controls are paramount. This makes sense 
because, of all the possible control points illustrated in Figure 7-5, only 
transcriptional control ensures that the cell will not synthesize superfluous 
intermediates. In the following sections we discuss the DNA and protein components 
that perform this function by regulating the initiation of gene transcription. We 
shall return at the end of the chapter to the additional ways of regulating gene 
expression. 

Summary 

The genome of a cell contains in its DNA sequence the information to make many 
thousands of different protein and RNA molecules. A cell typically expresses only a 
fraction of its genes, and the different types of cells in multicellular organisms arise 
because different sets of genes are expressed. Moreover, cells can change the pattern 
of genes they express in response to changes in their environment, such as signals 
from other cells. Although all of the steps involved in expressing a gene can in 
principle be regulated, for most genes the initiation of RNA transcription is the most 
important point of control. 



DNA-BINDING MOTIFS IN GENE REGULATORY 
PROTEINS 

How does a cell determine which of its thousands of genes to transcribe? As 
mentioned briefly in Chapters 4 and 6, the transcription of each gene is 
controlled by a regulatory region of DNA relatively near the site where transcription 
begins. Some regulatory regions are simple and act as switches that are thrown 
by a single signal. Many others are complex and act as tiny microprocessors, 
responding to a variety of signals that they interpret and integrate to switch the 
neighboring gene on or off. Whether complex or simple, these switching devices 

occur in the germ line, the cell lineage that gives rise to sperm or eggs. Most of 
the DNA in vertebrate germ cells is inactive and highly methylated. Over long 
periods of evolutionary time, the methylated CG sequences in these inactive 
regions have presumably been lost through spontaneous deamination events 
that were not properly repaired. However, promoters of genes that remain active 
in the germ cell lineages (including most housekeeping genes) are kept 
unmethylated, and therefore spontaneous deaminations of Cs that occur 
within them can be accurately repaired. Such regions are preserved in modern-day 
vertebrate cells as CG islands. In addition, any mutation of a CG sequence in the 
genome that destroyed the function or regulation of a gene in the adult would be 
selected against, and some CG islands are simply the result of a higher than 
normal density of critical CG sequences. 

The mammalian genome contains an estimated 20,000 CG islands. Most of 
the islands mark the 5' ends of transcription units and thus, presumably, of 
genes. The presence of CG islands often provides a convenient way of 
identifying genes in the DNA sequences of vertebrate genomes. 



Summary 

The many types of cells in animals and plants are created largely through 
mechanisms that cause different genes to be transcribed in different cells. Since many 
specialized animal cells can maintain their unique character through many cell 
division cycles and even when grown in culture, the gene regulatory mechanisms 
involved in creating them must be stable once established and heritable when the 
cell divides. These features endow the cell with a memory of its developmental history. 
Bacteria and yeasts provide unusually accessible model systems in which to study 
gene regulatory mechanisms. One such mechanism involves a competitive 
interaction between two gene regulatory proteins, each of which inhibits the synthesis of the 
other; this can create a flip-flop switch that switches a cell between two alternative 
patterns of gene expression. Direct or indirect positive feedback loops, which enable 
gene regulatory proteins to perpetuate their own synthesis, provide a general 
mechanism for cell memory. Negative feedback loops with programmed delays form the 
basis for cellular clocks. 

In eucaryotes the transcription of a gene is generally controlled by combinations 
of gene regulatory proteins. It is thought that each type of cell in a higher eucaryotic 
organism contains a specific combination of gene regulatory proteins that ensures 
the expression of only those genes appropriate to that type of cell. A given gene 
regulatory protein may be active in a variety of circumstances and typically is involved 
in the regulation of many genes. 

In addition to diffusible gene regulatory proteins, inherited states of chromatin 
condensation are also used by eucaryotic cells to regulate gene expression. An 
especially dramatic case is the inactivation of an entire X chromosome in female 
mammals. In vertebrates DNA methylation also functions in gene regulation, being used 
mainly as a device to reinforce decisions about gene expression that are made 
initially by other mechanisms. DNA methylation also underlies the phenomenon of 
genomic imprinting in mammals, in which the expression of a gene depends on 
whether it was inherited from the mother or the father. 

Figure 7-06 A mechanism to explain 
both the marked overall deficiency 
of CG sequences and their clustering 
into CG islands in vertebrate 
genomes. A black line marks the location 
of a CG dinucleotide in the DNA 
sequence, while a red "lollipop" indicates 
the presence of a methyl group on the 
CG dinucleotide. CG sequences that lie in 
regulatory sequences of genes that are 
transcribed in germ cells are unmethylated 
and therefore tend to be retained in 
evolution. Methylated CG sequences, on 
the other hand, tend to be lost through 
deamination of 5-methyl C to T unless the 
CG sequence is critical for survival. 



POSTTRANSCRIPTIONAL CONTROLS 

In principle, every step required for the process of gene expression could be 
controlled. Indeed, one can find examples of each type of regulation, although 
any one gene is likely to use only a few of them. Controls on the initiation of 
gene transcription are the predominant form of regulation for most genes. But 
other controls can act later in the pathway from DNA to protein to modulate 
the amount of gene product that is made. Although these posttranscriptional 
controls, which operate after RNA polymerase has bound to the gene's promoter 
and begun RNA synthesis, are less common than transcriptional control, for 
many genes they are crucial. 

Regulation of transcription 



The phenotypic differences that distinguish the 
various kinds of cells in a higher eukaryote are 
largely due to differences in the expression of 
genes that code for proteins, that is, those 
transcribed by RNA polymerase II. In principle, the 
expression of these genes might be regulated at 
any one of several stages. The concept of the 
"level of control" implies that gene expression 
is not necessarily an automatic process once it 
has begun. It could be regulated in a 
gene-specific way at any one of several sequential 
steps. We can distinguish (at least) five 
potential control points, forming the series: 

Activation of gene structure 
↓ 
Initiation of transcription 
↓ 
Processing the transcript 
↓ 
Transport to cytoplasm 
↓ 
Translation of mRNA 

The existence of the first step is implied by 
the discovery that genes may exist in either of 
two structural conditions. Relative to the state 
of most of the genome, genes are found in 
an "active" state in the cells in which they 
are expressed (see Chapter 2X). The change of 
structure is distinct from the act of 
transcription, and indicates that the gene is 
"transcribable." This suggests that acquisition of the 
"active" structure must be the first step in gene 
expression. 

Transcription of a gene in the active state is 
controlled at the stage of initiation, that is, by 
the interaction of RNA polymerase with its 
promoter. This is now becoming susceptible to 
analysis in the in vitro systems (see Chapter 
*a). For most genes, this is a major control 
point; probably it is the most common level of 
regulation. 

There is at present no evidence for control 
at subsequent stages of transcription in 
eukaryotic cells, for example, via antitermination 
mechanisms. 

The primary transcript is modified by capping 
at the 5' end, and usually also by 
polyadenylation at the 3' end. Introns must be spliced out 
from the transcripts of interrupted genes. The 
mature RNA must be exported from the nucleus 
to the cytoplasm. Regulation of gene expression 
by selection of sequences at the level of nuclear 
RNA might involve any or all of these stages, 
but the one for which we have most evidence 
concerns changes in splicing; some genes are 
expressed by means of alternative splicing 
patterns whose regulation controls the type of 
protein product (see Chapter 30). 

Finally, the translation of an mRNA in the 
cytoplasm can be specifically controlled. There is little 
evidence for the employment of this mechanism in 
adult somatic cells, but it does occur in some 
embryonic situations, as described in Chapter #. 
The mechanism is presumed to involve the 
blocking of initiation of translation of some mRNAs by 
specific protein factors. 

But having acknowledged that control of gene 
expression can occur at multiple stages, and 
that production of RNA cannot inevitably be 
equated with production of protein, it is clear 
that the overwhelming majority of regulatory 
events occur at the initiation of transcription. 
Regulation of tissue-specific gene transcription 
lies at the heart of eukaryotic differentiation; 
indeed, we see examples in Chapter 38 in 
which proteins that regulate embryonic 
development prove to be transcription factors. A 
regulatory transcription factor serves to provide 
common control of a large number of target 
genes, and we seek to answer two questions 
about this mode of regulation: what identifies 
the common target genes to the transcription 
factor; and how is the activity of the 
transcription factor itself regulated in response to 
intrinsic or extrinsic signals? 



Response elements identify genes under common regulation



The principle that emerges from characterizing
groups of genes under common control is that
they share a promoter element that is recognized
by a regulatory transcription factor. An element
that causes a gene to respond to such a factor
is called a response element; examples are the
HSE (heat shock response element), GRE
(glucocorticoid response element), SRE (serum
response element).

The properties of some inducible transcription
factors and the elements that they recognize are
summarized in Table 29.1. Response elements
have the same general characteristics as
upstream elements of promoters or enhancers.
They contain short consensus sequences, and
copies of the response elements found in different
genes are closely related, but not necessarily
identical. The region bound by the factor
extends for a short distance on either side of



Table 29.1 Inducible transcription factors bind to
response elements that identify groups of promoters
or enhancers subject to coordinate control.

Regulatory Agent   Module   Consensus          Factor
Heat shock         HSE      CNNGAANNTCCNNG     HSTF
Glucocorticoid     GRE      TGGTACAAATGTTCT    Receptor
Phorbol ester      TRE      TGACTCA            AP1
Serum              SRE      CCATATTAGG         SRF



the consensus sequence. In promoters, the elements
are not present at fixed distances from
the startpoint, but are usually <200 bp upstream
of it. The presence of a single element usually
is sufficient to confer the regulatory response,
but sometimes there are multiple copies.

Response elements may be located in promoters
or in enhancers. Some types of elements
are typically found in one rather than the other;
usually an HSE is found in a promoter, while a
GRE is found in an enhancer. We assume that
all response elements function by the same
general principle. A gene is regulated by a
sequence at the promoter or enhancer that is
recognized by a specific protein. The protein
functions as a transcription factor needed for
RNA polymerase to initiate. Active protein is
available only under conditions when the gene is
to be expressed; its absence means that the
promoter is not activated by this particular circuit.

An example of a situation in which many
genes are controlled by a single factor is
provided by the heat shock response. This is
common to a wide range of prokaryotes and
eukaryotes and involves multiple controls of
gene expression: an increase in temperature
turns off transcription of some genes, turns on
transcription of the heat shock genes, and
causes changes in the translation of mRNAs.
The control of the heat shock genes illustrates
the differences between prokaryotic and
eukaryotic modes of control. In bacteria, a new
sigma factor is synthesized that directs RNA
polymerase holoenzyme to recognize an
Abstract

Background: Prostate stem cell antigen (PSCA) is a recently defined homologue of the Thy-1/Ly-6 family of
glycosylphosphatidylinositol (GPI)-anchored cell surface antigens. The purpose of the present study was to
examine the expression status of PSCA protein and mRNA in clinical specimens of human prostate cancer (PCa)
and to validate it as a potential molecular target for diagnosis and treatment of PCa.

Materials and Methods: Immunohistochemical (IHC) and in situ hybridization (ISH) analyses of PSCA
expression were simultaneously performed on paraffin-embedded sections from 20 benign prostatic hyperplasia
(BPH), 20 prostatic intraepithelial neoplasm (PIN) and 48 prostate cancer (PCa) tissues, including 9
androgen-independent prostate cancers. The level of PSCA expression was semiquantitatively scored by assessing
both the percentage and intensity of PSCA-positive staining cells in the specimens. We then compared PSCA
expression between BPH, PIN and PCa tissues and analysed the correlations of PSCA expression level with
pathological grade, clinical stage and progression to androgen-independence in PCa.

Results: In BPH and low grade PIN, PSCA protein and mRNA staining were weak or negative and less intense
and uniform than that seen in HGPIN and PCa. There was moderate to strong PSCA protein and mRNA
expression in 8 of 11 (72.7%) HGPIN and in 40 of 48 (83.4%) PCa specimens examined by IHC and ISH analyses,
a statistically significant difference compared with BPH (20%) and low grade PIN (22.2%) samples (p < 0.05,
respectively). The expression level of PSCA increased with high Gleason grade, advanced stage and progression
to androgen-independence (p < 0.05, respectively). In addition, IHC and ISH staining showed a high degree of
correlation between PSCA protein and mRNA overexpression.

Conclusions: Our data demonstrate that PSCA as a new cell surface marker is overexpressed by a majority of
human PCa. PSCA expression correlates positively with adverse tumor characteristics, such as increasing
pathological grade (poor cell differentiation), worsening clinical stage and androgen-independence, and
speculatively with prostate carcinogenesis. PSCA protein overexpression results from upregulated transcription
of PSCA mRNA. PSCA may have prognostic utility and may be a promising molecular target for diagnosis and
treatment of PCa.
Introduction

Prostate cancer (PCa) is the second leading cause of
cancer-related death in American men and is an
increasingly common cancer in China. Despite great
recent progress in the diagnosis and management of
localized disease, there continues to be a need for new
diagnostic markers that can accurately discriminate between
indolent and aggressive variants of PCa. There also
continues to be a need for the identification and characterization
of potential new therapeutic targets on PCa cells. Current
diagnostic and therapeutic modalities for recurrent and
metastatic PCa have been limited by a lack of specific
target antigens of PCa.

Although a number of prostate-specific genes have been
identified (e.g. prostate specific antigen, prostatic acid
phosphatase, glandular kallikrein 2), the majority of these
are secreted proteins not ideally suited for many
immunological strategies. The identification of new cell surface
antigens is therefore critical to the development of new diagnostic
and therapeutic approaches to the management of PCa.

Reiter RE et al [1] reported the identification of prostate
stem cell antigen (PSCA), a cell surface antigen that is
predominantly prostate specific. The PSCA gene encodes a
123 amino acid glycoprotein, with 30% homology to
stem cell antigen 2 (Sca-2). Like Sca-2, PSCA is a
member of the Thy-1/Ly-6 family and is anchored by
a glycosylphosphatidylinositol (GPI) linkage. mRNA in
situ hybridization (ISH) localized PSCA expression in
normal prostate to the basal cell epithelium, the putative
stem cell compartment of prostatic epithelium, suggesting
that PSCA may be a marker of prostate stem/progenitor
cells.

In order to examine the status of PSCA protein and mRNA
expression in human PCa and validate it as a potential
diagnostic and therapeutic target for PCa, we used
immunohistochemistry (IHC) and in situ hybridization (ISH)
simultaneously, and conducted PSCA protein and mRNA
expression analyses in paraffin-embedded tissue specimens
of benign prostatic hyperplasia (BPH, n = 20),
prostate intraepithelial neoplasm (PIN, n = 20) and prostate
cancer (PCa, n = 48). Furthermore, we evaluated the possible
correlation of PSCA expression level with PCa tumorigenesis,
grade, stage and progression to androgen-independence.

Materials and methods 

Tissue samples 

All of the clinical tissue specimens studied herein were
obtained from 80 patients aged 57-84 years by prostatectomy,
transurethral resection of prostate (TURP) or biopsy.
The patients were classified as 20 cases of BPH, 20
cases of PIN and 40 cases of primary PCa, including 9 patients
with recurrent PCa and a history of androgen ablation
therapy (orchiectomy and/or hormonal therapy), who
were referred to as androgen-independent prostate cancers.
Eight specimens were harvested from these
androgen-independent PCa patients prior to androgen ablation
treatment. Each tissue sample was cut into two parts: one
was fixed in 10% formalin for IHC and the other was treated
with 4% paraformaldehyde/0.1 M PBS pH 7.4 in 0.1%
DEPC for 1 h for ISH analysis, and then embedded in paraffin.
All paraffin blocks examined were then cut into 5
µm sections and mounted on glass slides specific for
IHC and ISH respectively in the usual fashion. An
H&E-stained section of each PCa was evaluated and assigned a
Gleason score by the experienced urological pathologist at
our institution based on the criteria of Gleason score [2].
The Gleason sums are summarized in Table 1. Clinical
staging was performed according to the Jewett-Whitmore-Prout
staging system, as shown in Table 2. In the category
of PIN, we graded the specimens into two groups, i.e. low
grade PIN (grade I - II) and high grade PIN (HGPIN,
grade III) on the basis of the literature [3,4].

Immunohistochemical (IHC) analysis

Briefly, tissue sections were deparaffinized, dehydrated,
and subjected to microwaving in 10 mmol/L citrate
buffer, pH 6.0 (Boshide, Wuhan, China) in a 900 W oven
for 5 min to induce epitope retrieval. Slides were allowed
to cool at room temperature for 30 min. A primary mouse
antibody specific to human PSCA (Boshide, Wuhan,
China) at a 1:100 dilution was incubated with
the slides at room temperature for 2 h. Labeling was
detected by sequentially adding biotinylated secondary
antibodies and streptavidin-peroxidase, and localized
using the 3,3'-diaminobenzidine reaction. Sections were then
counterstained with hematoxylin. Substitution of the
primary antibody with phosphate-buffered saline (PBS)
served as a negative-staining control.

mRNA in situ hybridization (ISH) 

Five-µm-thick tissue sections were deparaffinized and
dehydrated, then digested in pepsin solution (4 mg/ml in
3% citric acid) for 20 min at 37.5 °C, and further
processed for ISH. Digoxigenin-labeled sense and antisense
human PSCA RNA probes (obtained from Boshide,
Wuhan, China) were hybridized to the sections at 48 °C
overnight. The high-stringency posthybridization wash
was performed sequentially at 37 °C in 2 x standard
saline citrate (SSC) for 10 min, in 0.5 x SSC for 15 min
and in 0.2 x SSC for 30 min. The slides were then
incubated with biotinylated mouse anti-digoxigenin antibody at
37.5 °C for 1 h followed by washing in 1 x PBS for 20 min
at room temperature, and then with streptavidin-peroxidase
at 37.5°C for 20 min followed by washing in 1 x PBS for 
15 min at room temperature. Subsequently, the slides 
were developed with diaminobenzidine and then coun- 
Table 1: Correlation of PSCA expression with Gleason score

                 Intensity x frequency
Gleason score    0-6 (%)      9 (%)
2-4              5 (83)       1 (17)
5-7              19 (79)      5 (21)
8-10             5 (28)       13 (72)

Table 2: Correlation of PSCA expression with clinical stage

                 Intensity x frequency
Tumor stage      0-6 (%)      9 (%)
≤B               27 (67.5)    13 (32.5)
≥C               2 (25)       6 (75)

terstained with hematoxylin to localize the hybridization 
signals. Sections hybridized with the sense control probes 
routinely did not show any specific hybridization signal 
above background. All slides were hybridized with PBS to 
substitute for the probes as a negative control. 

Scoring methods 

To determine the correlation between the results of PSCA
immunostaining and mRNA in situ hybridization, the
same scoring method was used in the present study for
PSCA protein staining by IHC and PSCA mRNA staining
by ISH. Each slide was read and scored independently
by two experienced urological pathologists using Olympus
BX-41 light microscopes. The evaluation was done in a
blinded fashion. For each section, five areas of similar
grade were analyzed semiquantitatively for the fraction of
cells staining. Fifty percent of specimens were randomly
chosen and rescored to determine the degree of
interobserver and intraobserver concordance. There was greater
than 95% intra- and interobserver agreement.

The intensity of PSCA expression evaluated microscopically
was graded on a scale of 0 to 3+ with 3+ being the
highest expression observed (0, no staining; 1+, mildly
intense; 2+, moderately intense; 3+, severely intense). The
staining density was quantified as the percentage of cells
staining positive for PSCA with the primary antibody or
hybridization probe, as follows: 0 = no staining; 1 = positive
staining in <25% of the sample; 2 = positive staining
in 25%-50% of the sample; 3 = positive staining in >50%
of the sample. Intensity score (0 to 3+) was multiplied by
the density score (0-3) to give an overall score of 0-9
[1,5]. In this way, we were able to differentiate specimens
that may have had focal areas of increased staining from
those that had diffuse areas of increased staining [6]. The
overall score for each specimen was then categorically
assigned to one of the following groups: 0 score, negative
expression; 1-2 scores, weak expression; 3-6 scores,
moderate expression; 9 score, strong expression.
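
The scoring scheme described above can be sketched in code. The following is a minimal Python illustration only, not the authors' software. The function names are hypothetical, and the treatment of exact boundary values (a sample with exactly 25% or exactly 50% positive cells is placed in the 25%-50% band) is an assumption, since the text does not say how such ties were resolved.

```python
def density_score(percent_positive: float) -> int:
    """Map the percentage of PSCA-positive cells to the 0-3 density score."""
    if not 0 <= percent_positive <= 100:
        raise ValueError("percentage must lie between 0 and 100")
    if percent_positive == 0:
        return 0  # no staining
    if percent_positive < 25:
        return 1  # positive staining in <25% of the sample
    if percent_positive <= 50:
        return 2  # positive staining in 25%-50% of the sample (boundary handling assumed)
    return 3      # positive staining in >50% of the sample


def composite_score(intensity: int, percent_positive: float) -> int:
    """Overall score = intensity (0 to 3+) multiplied by density (0-3)."""
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2 or 3")
    return intensity * density_score(percent_positive)


def expression_category(score: int) -> str:
    """Assign the overall score to the categories listed in the text."""
    if score == 0:
        return "negative"
    if score in (1, 2):
        return "weak"
    if score in (3, 4, 6):
        return "moderate"
    if score == 9:
        return "strong"
    raise ValueError("not a possible product of two 0-3 scores")


# Example: a hypothetical specimen with 3+ intensity in 60% of cells
example = composite_score(3, 60)
print(example, expression_category(example))
```

Because the overall score is a product of two 0-3 values, only 0, 1, 2, 3, 4, 6 and 9 can occur, which is why the categories in the text skip 7 and 8.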

Statistical analysis 

Intensity and density of PSCA protein and mRNA expression
in BPH, PIN and PCa tissues were compared using the
Chi-square and Student's t-test. Univariate associations
between PSCA expression and Gleason score, clinical
stage and progression to androgen-independence were
calculated using Fisher's Exact Test. For all analyses, p <
0.05 was considered statistically significant.
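
As an illustration of the univariate test named above, the sketch below applies a two-sided Fisher's exact test to the 2x2 counts reported in Table 2 (stage ≤B: 27 and 13; stage ≥C: 2 and 6). It is a minimal pure-Python implementation written for this example; the authors' statistical software is not stated, and the choice of a two-sided test is an assumption.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def weight(x: int) -> int:
        # Number of arrangements with x in the top-left cell (margins fixed)
        return comb(row1, x) * comb(row2, col1 - x)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = weight(a)
    # Integer comparison avoids floating-point ties
    extreme = sum(w for w in (weight(x) for x in range(lowest, highest + 1))
                  if w <= observed)
    return extreme / comb(n, col1)


# Table 2 counts: rows are stage (<=B, >=C), columns are score (0-6, 9)
p_stage = fisher_exact_two_sided(27, 13, 2, 6)
print(f"two-sided p = {p_stage:.4f}")
```

Under these assumptions the p-value for the Table 2 counts falls below the 0.05 threshold, which agrees with the significance reported for clinical stage.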

Results 

PSCA expression in BPH 

In general, PSCA protein and mRNA were expressed
weakly in individual samples of BPH. Some areas of
prostate expressed weak levels (composite score 1-2),
whereas other areas were completely negative (composite
score 0). Four cases (20%) of BPH had moderate expression
of PSCA protein and mRNA (composite score 4-6)
by IHC and ISH. In 2/20 (10%) BPH specimens, PSCA
mRNA expression was moderate (composite score 3-6),
but PSCA protein expression was weak (composite score
2) in one and negative (composite score 0) in the other.
PSCA expression was localized to the basal and secretory
epithelial cells, and prostatic stroma was almost negative
for PSCA protein and mRNA staining in all cases
examined.

PSCA expression in PIN

In this study, we detected weak or negative expression of
PSCA protein and mRNA (≤2 scores) in 7 of 9 (77.8%)
low grade PIN and in 2 of 11 (18.2%) HGPIN, and moderate
expression (3-6 scores) in the remaining 2 low grade PIN
and 5 of 11 (45.5%) HGPIN. One HGPIN with moderate
PSCA mRNA expression (6 score) showed weak staining
for PSCA protein (2 score) by IHC. Strong PSCA protein
and mRNA expression (9 score) was detected in the
remaining 3 of 11 (27.3%) HGPIN. There was a statistically
significant difference in PSCA protein and mRNA
expression levels between HGPIN and BPH (p <
0.05), but no statistically significant difference between low
grade PIN and BPH (p > 0.05).

PSCA expression in PCa

In order to determine if PSCA protein and mRNA can be
detected in prostate cancers and if PSCA expression levels
are increased in malignant compared with benign glands,
forty-eight paraffin-embedded PCa specimens were analysed
by IHC and ISH. It was shown that 19 of 48 (39.6%)
PCa samples stained very strongly for PSCA protein and
mRNA with a score of 9 and another 21 (43.8%) specimens
displayed moderate staining with scores of 4-6 (Figure 1).
In addition, 4 specimens with moderate to strong
PSCA mRNA expression (scores of 4-9) had weak protein
staining (a score of 2) by IHC analyses. Overall, PCa
expressed a significantly higher level of PSCA protein and
mRNA than any other specimen category in this study (p
< 0.05, compared with BPH and PIN respectively). The
result demonstrates that PSCA protein and mRNA are
overexpressed by a majority of human PCa.

Correlation of PSCA expression with Gleason score in PCa

Using the semi-quantitative scoring method as described
in Materials and Methods, we compared the expression
level of PSCA protein and mRNA with Gleason grade of
PCa, as shown in Table 1. Prostate adenocarcinomas were
graded by Gleason score as 2-4 scores = well-differentiation,
5-7 scores = moderate-differentiation and 8-10
scores = poor-differentiation [7]. Seventy-two percent of
Gleason scores 8-10 prostate cancers had very strong
staining of PSCA compared to 21% with Gleason scores
5-7 and 17% with 2-4 respectively, demonstrating that
poorly differentiated PCa had significantly stronger
expression of PSCA protein and mRNA than moderately
and well differentiated tumors (p < 0.05). As depicted in
Figure 1, IHC and ISH analyses showed that PSCA protein
and mRNA expression in several cases of poorly differentiated
PCa were particularly prominent, with more intense
and uniform staining. The results indicate that PSCA
expression increases significantly with higher tumor grade
in human PCa.

Correlation of PSCA expression with clinical stage in PCa

The results for PSCA expression at each stage of PCa are
shown in Table 2. Seventy-five percent of
locally advanced and node positive cancers (i.e. C-D
stages) expressed statistically high levels of PSCA versus
32.5% that were organ confined (i.e. A-B stages) (p <
0.05). The data demonstrate that PSCA expression
increases significantly with advanced tumor stage in
human PCa.

Correlation of PSCA expression with androgen-independent
progression of PCa

All 9 specimens of androgen-independent prostate cancers
stained positive for PSCA protein and mRNA. Eight
specimens were obtained from patients managed prior to
androgen ablation therapy. Seven of eight (87.5%) of
these androgen-independent prostate cancers were in the
strongest staining category (score = 9), compared with
three out of eight (37.5%) of patients with
androgen-dependent cancers (p < 0.05). The results demonstrate
that PSCA expression increases significantly with progression
to androgen-independence of human PCa.

It is evident from the results above that within a majority
of human prostate cancers the level of PSCA protein and
mRNA expression correlates significantly with increasing
grade, worsening stage and progression to
androgen-independence.

Correlation of PSCA immunostaining and mRNA in situ
hybridization

In all 88 specimens surveyed herein, we compared the
results of PSCA IHC staining with mRNA ISH analysis.
Positive staining areas and their intensity and density scores
evaluated by IHC were identical to those seen by ISH in 79
of 88 (89.8%) specimens (18/20 BPH, 19/20 PIN and 42/48
PCa respectively). Importantly, 27/27 samples with
PSCA mRNA composite scores of 0-2, 32/36 samples
with scores of 3-6 and 22/24 samples with a score of 9
also had PSCA protein expression scores of 0-2, 3-6 and
9 respectively. However, in 5 samples with PSCA mRNA
overall scores of 3-6 and in 2 with scores of 9 there was
lower or negative PSCA protein expression (i.e. scores of
0-4), suggesting that this may reflect posttranscriptional
modification of PSCA or that the epitopes recognized by
the PSCA mAb may be obscured in some cancers. The data
demonstrate that the results of PSCA immunostaining
were consistent with those of mRNA ISH analysis, showing
a high degree of correlation between PSCA protein
and mRNA expression.
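
The overall agreement figure quoted above follows directly from the per-tissue counts. The short Python sketch below reproduces that arithmetic from the numbers given in the text; the variable and function names are illustrative only.

```python
# (identical, total) specimen counts per tissue group, as reported in the text
concordance = {"BPH": (18, 20), "PIN": (19, 20), "PCa": (42, 48)}


def percent_agreement(identical: int, total: int) -> float:
    """Percentage of specimens with identical IHC and ISH scores."""
    return 100.0 * identical / total


identical_all = sum(identical for identical, _ in concordance.values())
total_all = sum(total for _, total in concordance.values())
overall = percent_agreement(identical_all, total_all)

for tissue, (identical, total) in concordance.items():
    print(f"{tissue}: {identical}/{total} = {percent_agreement(identical, total):.1f}%")
print(f"Overall: {identical_all}/{total_all} = {overall:.1f}%")
```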
Representative PSCA IHC and ISH staining in PCa (A, IHC staining; B, ISH staining; x200 magnification). A1, B1: negative
controls of IHC and ISH. PBS replacing the primary antibody (A1) and hybridization with a sense PSCA probe (B1) showed no
background staining. A2, B2: a moderately differentiated PCa (Gleason score = 3+3 = 6) with moderate staining (composite score =
6) in all malignant cells; A2: IHC shows not only cell surface but also apparent cytoplasmic staining of PSCA protein. A3, B3: a
poorly differentiated PCa (Gleason score = 4+4 = 8) with very strong staining (composite score = 9) in all malignant cells.
Discussion

PSCA is homologous to a group of cell surface proteins
that mark the earliest phase of hematopoietic development.
PSCA mRNA expression is prostate-specific in normal
male tissues and is highly up-regulated in both
androgen-dependent and -independent PCa xenografts
(LAPC-4 tumors). We hypothesize that PSCA may play a
role in PCa tumorigenesis and progression, and may serve
as a target for PCa diagnosis and treatment. In this study,
IHC and ISH showed that in general there was weak or
absent PSCA protein and mRNA expression in BPH and
low grade PIN tissues. However, PSCA protein and mRNA
are widely expressed in HGPIN, the putative precursor of
invasive PCa, suggesting that up-regulation of PSCA is an
early event in prostate carcinogenesis. Recently, Reiter RE
et al [1], using ISH analysis, reported that 97 of 118 (82%)
HGPIN specimens stained strongly positive for PSCA
mRNA. A very similar finding was seen for mouse PSCA
(mPSCA) expression in mouse HGPIN tissues by Tran C.P
et al [8]. These data suggest that PSCA may be a new
marker associated with transformation of prostate cells
and tumorigenesis.

We observed that PSCA protein and mRNA are highly
expressed in a large percentage of human prostate cancers,
including advanced, poorly differentiated,
androgen-independent and metastatic cases. Fluorescence-activated
cell sorting and confocal/immunofluorescent studies
demonstrated cell surface expression of PSCA protein in
PCa cells [9]. Our IHC expression analysis of PSCA shows
not only cell surface but also apparent cytoplasmic staining
of PSCA protein in PCa specimens (Figure 1). One possible
explanation for this is that the anti-PSCA antibody can
recognize PSCA peptide precursors that reside in the
cytoplasm. Also, it is possible that the positive staining that
appears in the cytoplasm is actually from the overlying
cell membrane [5]. These data seem to indicate that PSCA
is a novel cell surface marker for human PCa.

Our results show that an elevated level of PSCA expression
correlates with high grade (i.e. poor differentiation),
increased tumor stage and progression to
androgen-independence of PCa. These findings support the original IHC
analyses by Gu Z et al [9], who reported that PSCA protein
was expressed in 94% of primary PCa and that the intensity of
PSCA protein expression increased with tumor grade,
stage and progression to androgen-independence. Our
results also corroborate the recent work of Han KR et al
[10], in which a significant association was found between high
PSCA expression and adverse prognostic features such as
high Gleason score, seminal vesicle invasion and capsular
involvement in PCa. It is suggested that PSCA
overexpression may be an adverse predictor for recurrence,
clinical progression or survival of PCa. Hara H et al
[11] used RT-PCR detection of PSA, PSMA and PSCA in 1
ml of peripheral blood to evaluate PCa patients with poor
prognosis. The results showed that among 58 PCa
patients, each PCR indicated the prognostic value in the
hierarchy of PSCA>PSA>PSMA RT-PCR, and extraprostatic
cases with positive PSCA PCR indicated lower
disease-progression-free survival than those with negative PSCA PCR,
demonstrating that PSCA can be used as a prognostic
factor. Dubey P et al [12] reported that elevated numbers of
PSCA+ cells correlate positively with the onset and
development of prostate carcinoma over a long time span in
the prostates of the TRAMP and PTEN +/- models compared
with normal prostates. Taken together with our
present findings, in which PSCA is overexpressed from
HGPIN to almost frank carcinoma, it is reasonable and
possible to use increased PSCA expression level or
increased numbers of PSCA-positive cells in prostate
samples as a prognostic marker to predict the potential
onset of this cancer. These data raise the possibility that
PSCA may have diagnostic utility or clinical prognostic
value in human PCa.

The cause of PSCA overexpression in PCa is not known.
One possible mechanism is that it may result from PSCA
gene amplification. In humans, PSCA is located on
chromosome 8q24.2 [1], which is often amplified in
metastatic and recurrent PCa and considered to indicate a poor
prognosis [13-15]. Interestingly, PSCA is in close proximity
to the c-myc oncogene, which is amplified in >20% of
recurrent and metastatic prostate cancers [16,17]. Reiter
RE et al [18] reported that PSCA and MYC gene copy numbers
were co-amplified in 25% of tumors (five out of
twenty), demonstrating that PSCA overexpression is
associated with PSCA and MYC coamplification in PCa. Gu Z
et al [9] recently reported that in 102 specimens available
to compare the results of PSCA immunostaining with
their previous mRNA ISH analysis, 92 (90.2%) had
identically positive areas of PSCA protein and mRNA expression.
Taken together with our findings, in which we
detected moderate to strong expression of PSCA protein
and mRNA in 34 of 40 (85%) PCa specimens examined
simultaneously by IHC and ISH analyses, it is demonstrated
that PSCA protein and mRNA are overexpressed in
human PCa, and that the increased protein level of PSCA
resulted from the upregulated transcription of its
mRNA.

At present, the regulation mechanisms of human PSCA
expression and its biological function are yet to be
elucidated. PSCA expression may be regulated by multiple
factors [18]. Watabe T et al [19] reported that transcriptional
control is a major component regulating PSCA expression
levels. In addition, induction of PSCA expression may be
regulated or mediated through cell-cell contact and
protein kinase C (PKC) [20]. Homologues of PSCA have
diverse activities, and have themselves been involved in
carcinogenesis. Signalling through SCA-2 has been
demonstrated to prevent apoptosis in immature thymocytes
[21]. Thy-1 is involved in T cell activation and transduces
signals through src-like tyrosine kinases [22]. Ly-6 genes
have been implicated both in tumorigenesis and in
cell-cell adhesion [23-25]. Cell-cell or cell-matrix interaction is
critical for local tumor growth and spread to distal sites.
From its restricted expression in basal cells of normal
prostate and its homology to SCA-2, PSCA may play a role
in stem/progenitor cell function, such as self-renewal (i.e.
anti-apoptosis) and/or proliferation [1]. Taken together
with the results in the present study, we speculate that
PSCA may play a role in tumorigenesis and clinical
progression of PCa through affecting cell transformation and
proliferation. From our results, it is also suggested that
PSCA as a new cell surface antigen may have a number of
potential uses in the diagnosis, therapy and clinical
prognosis of human PCa. PSCA overexpression in prostate
biopsies could be used to identify patients at high risk of
developing recurrent or metastatic disease, and to
discriminate cancers from normal glands in prostatectomy
samples. Similarly, the detection of PSCA-overexpressing cells
in bone marrow or peripheral blood may identify and
predict metastatic progression better than current assays,
which identify only PSA-positive or PSMA-positive
prostate cells.

In summary, we have shown in this study that PSCA
protein and mRNA are maintained in expression from
HGPIN through all stages of PCa in a majority of cases,
which may be associated with prostate carcinogenesis and
correlate positively with high tumor grade (poor cell
differentiation), advanced stage and androgen-independent
progression. PSCA protein overexpression is due to the
upregulation of its mRNA transcription. The results
suggest that PSCA may be a promising molecular marker for
the clinical prognosis of human PCa and a valuable target
for diagnosis and therapy of this tumor.
Abstract

Translation initiation is regulated in response to
nutrient availability and mitogenic stimulation and is
coupled with cell cycle progression and cell growth.
Several alterations in translational control occur in
cancer. Variant mRNA sequences can alter the
translational efficiency of individual mRNA molecules,
which in turn play a role in cancer biology. Changes in
the expression or availability of components of the
translational machinery and in the activation of
translation through signal transduction pathways can
lead to more global changes, such as an increase in
the overall rate of protein synthesis and translational
activation of the mRNA molecules involved in cell
growth and proliferation. We review the basic
principles of translational control, the alterations
encountered in cancer, and selected therapies
targeting translation initiation to help elucidate new
therapeutic avenues.

Introduction 

The fundamental principle of molecular therapeutics in cancer
is to exploit the differences in gene expression between
cancer cells and normal cells. With the advent of cDNA array
technology, most efforts have concentrated on identifying
differences in gene expression at the level of mRNA, which
can be attributable either to DNA amplification or to
differences in transcription. Gene expression is quite complicated,
however, and is also regulated at the level of mRNA stability,
mRNA translation, and protein stability.

The power of translational regulation has been best recognized
among developmental biologists, because transcription
does not occur in early embryogenesis in eukaryotes. For
example, in Xenopus, the period of transcriptional quiescence
continues until the embryo reaches midblastula transition, the
4000-cell stage. Therefore, all necessary mRNA molecules are
transcribed during oogenesis and stockpiled in a translationally
inactive, masked form. The mRNA are translationally activated
at appropriate times during oocyte maturation, fertilization, and
early embryogenesis and thus, are under strict translational
control.

Translation has an established role in cell growth. Basically,
an increase in protein synthesis occurs as a consequence
of mitogenesis. Until recently, however, little was
known about the alterations in mRNA translation in cancer,
and much is yet to be discovered about their role in the
development and progression of cancer. Here we review the
basic principles of translational control, the alterations
encountered in cancer, and selected therapies targeting
translation initiation to elucidate potential new therapeutic avenues.

Basic Principles of Translational Control
Mechanism of Translation Initiation
Translation initiation is the main step in translational regulation.
Translation initiation is a complex process in which the
tRNA and the 40S and 60S ribosomal subunits are recruited to
the 5' end of a mRNA molecule and assembled by eukaryotic
translation initiation factors into an 80S ribosome at the start
codon of the mRNA (Fig. 1). The 5' end of eukaryotic mRNA is
capped, i.e., contains the cap structure m7GpppN
(7-methylguanosine-triphospho-5'-ribonucleoside). Most translation in
eukaryotes occurs in a cap-dependent fashion, i.e., the cap is
specifically recognized by eIF4E, which binds the 5' cap.
The eIF4F translation initiation complex is then formed by the
assembly of eIF4E, the RNA helicase eIF4A, and eIF4G, a
scaffolding protein that mediates the binding of the 40S
ribosomal subunit to the mRNA molecule through interaction with
the eIF3 protein present on the 40S ribosome. eIF4A and eIF4B
participate in melting the secondary structure of the 5' UTR of
the mRNA. The 43S initiation complex (40S/eIF2/Met-tRNA/
GTP complex) scans the mRNA in a 5'->3' direction until it
encounters an AUG start codon. This start codon is then
base-paired to the anticodon of initiator tRNA, forming the 48S
initiation complex. The initiation factors are then displaced from the
48S complex, and the 60S ribosome joins to form the 80S
ribosome.

Unlike most eukaryotic translation, translation initiation of certain
mRNAs, such as the picornavirus RNA, is cap independent and occurs by
internal ribosome entry. This mechanism does not require eIF4E. Either
the 43S complex can bind the initiation codon directly through
interaction with the IRES in the 5' UTR, such as in the
encephalomyocarditis virus, or it can initially attach to the IRES and
then reach the initiation codon by scanning or transfer, as is the case
with the poliovirus (1).

Fig. 1. Translation initiation in eukaryotes. The 4E-BPs are
hyperphosphorylated to release eIF4E so that it can interact with the 5'
cap, and the eIF4F initiation complex is assembled. The interaction of
poly(A) binding protein with the initiation complex and circularization
of the mRNA is not depicted in the diagram. The secondary structure of
the 5' UTR is melted, the 40S ribosomal subunit is bound to eIF3, and
the ternary complex consisting of eIF2, GTP, and the Met-tRNA is
recruited to the mRNA. The ribosome scans the mRNA in a 5'→3' direction
until an AUG start codon is found in the appropriate sequence context.
The initiation factors are released, and the large ribosomal subunit is
recruited.

Regulation of Translation Initiation
Translation initiation can be regulated by alterations in the
expression or phosphorylation status of the various factors involved.
Key components in translational regulation that may provide potential
therapeutic targets follow.

eIF4E. eIF4E plays a central role in translation regulation. It is the
least abundant of the initiation factors and is considered the
rate-limiting component for initiation of cap-dependent translation.
eIF4E may also be involved in mRNA splicing, mRNA 3' processing, and
mRNA nucleocytoplasmic transport (2). eIF4E expression can be increased
at the transcriptional level in response to serum or growth factors (3).
eIF4E overexpression may cause preferential translation of mRNAs
containing excessive secondary structure in their 5' UTR that are
normally discriminated against by the translational machinery and thus
are inefficiently translated (4-7). As examples of this, overexpression
of eIF4E promotes increased translation of vascular endothelial growth
factor, fibroblast growth factor-2, and cyclin D1 (2, 8, 9).

Another mechanism of control is the regulation of eIF4E
phosphorylation. eIF4E phosphorylation is mediated by the
mitogen-activated protein kinase-interacting kinase 1, which is
activated by the mitogen-activated pathway activating extracellular
signal-related kinases and the stress-activated pathway acting through
p38 mitogen-activated protein kinase (10-13). Several mitogens, such as
serum, platelet-derived growth factor, epidermal growth factor, insulin,
angiotensin II, src kinase overexpression, and ras overexpression, lead
to eIF4E phosphorylation (14). The phosphorylation status of eIF4E is
usually correlated with the translational rate and growth status of the
cell; however, eIF4E phosphorylation has also been observed in response
to some cellular stresses when translational rates actually decrease
(15). Thus, further study is needed to understand the effects of eIF4E
phosphorylation on eIF4E activity.

Another mechanism of regulation is the alteration of eIF4E availability
by the binding of eIF4E to the eIF4E-binding proteins (4E-BP, also known
as PHAS-I). 4E-BPs compete with eIF4G for a binding site in eIF4E. The
binding of eIF4E to the best characterized eIF4E-binding protein,
4E-BP1, is regulated by 4E-BP1 phosphorylation. Hypophosphorylated
4E-BP1 binds to eIF4E, whereas 4E-BP1 hyperphosphorylation decreases
this binding. Insulin, angiotensin, epidermal growth factor,
platelet-derived growth factor, hepatocyte growth factor, nerve growth
factor, insulin-like growth factors I and II, interleukin 3,
granulocyte-macrophage colony-stimulating factor + steel factor,
gastrin, and the adenovirus have all been reported to induce
phosphorylation of 4E-BP1 and to decrease the ability of 4E-BP1 to bind
eIF4E (15, 16). Conversely, deprivation of nutrients or growth factors
results in 4E-BP1 dephosphorylation, an increase in eIF4E binding, and a
decrease in cap-dependent translation.

p70 S6 Kinase. Phosphorylation of ribosomal 40S protein S6 by S6K is
thought to play an important role in translational regulation. S6K-/-
mouse embryonic cells proliferate more slowly than do parental cells,
demonstrating that S6K has a positive influence on cell proliferation
(17). S6K regulates the translation of a group of mRNAs possessing a 5'
terminal oligopyrimidine tract (5' TOP) found at the 5' UTR of ribosomal
protein mRNAs and other mRNAs coding for components of the translational
machinery. Phosphorylation of S6K is regulated in part based on the
availability of nutrients (18, 19) and is stimulated by several growth
factors, such as platelet-derived growth factor and insulin-like growth
factor I (20).

eIF2α Phosphorylation. The binding of the initiator tRNA to the small
ribosomal unit is mediated by translation initiation factor eIF2.
Phosphorylation of the α-subunit of eIF2 prevents formation of the
eIF2/GTP/Met-tRNA complex and inhibits global protein synthesis (21,
22). eIF2α is phosphorylated under a variety of conditions, such as
viral infection, nutrient deprivation, heme deprivation, and apoptosis
(22). eIF2α is phosphorylated by heme-regulated inhibitor,
nutrient-regulated protein kinase, and the IFN-induced, double-stranded
RNA-activated protein kinase (PKR; Ref. 23).



The mTOR Signaling Pathway. The macrolide antibiotic rapamycin
(Sirolimus; Wyeth-Ayerst Research, Collegeville, PA) has been the
subject of intensive study because it inhibits signal transduction
pathways involved in T-cell activation. The rapamycin-sensitive
component of these pathways is mTOR (also called FRAP or RAFT1). mTOR is
the mammalian homologue of the yeast TOR proteins that regulate G1
progression and translation in response to nutrient availability (24).
mTOR is a serine-threonine kinase that modulates translation initiation
by altering the phosphorylation status of 4E-BP1 and S6K (Fig. 2; Ref.
25).

4E-BP1 is phosphorylated on multiple residues. mTOR phosphorylates the
Thr-37 and Thr-46 residues of 4E-BP1 in vitro (26); however,
phosphorylation at these sites is not associated with a loss of eIF4E
binding. Phosphorylation of Thr-37 and Thr-46 is required for subsequent
phosphorylation at several COOH-terminal, serum-sensitive sites; a
combination of these phosphorylation events appears to be needed to
inhibit the binding of 4E-BP1 to eIF4E (25). The product of the ATM
gene, the p38/MSK1 pathway, and protein kinase Cα also play a role in
4E-BP1 phosphorylation (27-29).

S6K and 4E-BP1 are also regulated, in part, by PI3K and its downstream
protein kinase Akt. PTEN is a phosphatase that negatively regulates PI3K
signaling. PTEN null cells have constitutively active Akt, with
increased S6K activity and S6 phosphorylation (30). S6K activity is
inhibited both by the PI3K inhibitors wortmannin and LY294002 and by the
mTOR inhibitor rapamycin (24). Akt phosphorylates Ser-2448 in mTOR in
vitro, and this site is phosphorylated upon Akt activation in vivo
(31-33). Thus, mTOR is regulated by the PI3K/Akt pathway; however, this
does not appear to be the only mode of regulation of mTOR activity.
Whether the PI3K pathway also regulates S6K and 4E-BP1 phosphorylation
independent of mTOR is controversial.

Interestingly, mTOR autophosphorylation is blocked by wortmannin but
not by rapamycin (34). This seeming inconsistency suggests that
mTOR-responsive regulation of 4E-BP1 and S6K activity occurs through a
mechanism other than intrinsic mTOR kinase activity. An alternate
pathway for 4E-BP1 and S6K phosphorylation by mTOR activity is by the
inhibition of a phosphatase. Treatment with calyculin A, an inhibitor of
phosphatases 1 and 2A, reduces rapamycin-induced dephosphorylation of
4E-BP1 and S6K (35). PP2A interacts with full-length S6K but not with a
S6K mutant that is resistant to dephosphorylation resulting from
rapamycin. mTOR phosphorylates PP2A in vitro; however, how this process
alters PP2A activity is not known. These results are consistent with the
model that phosphorylation of a phosphatase by mTOR prevents
dephosphorylation of 4E-BP1 and S6K, and conversely, that nutrient
deprivation and rapamycin block inhibition of the phosphatase by mTOR.

Polyadenylation. The poly(A) tail in eukaryotic mRNA is important in
enhancing translation initiation and mRNA stability. Polyadenylation
plays a key role in regulating gene expression during oogenesis and
early embryogenesis. Some mRNA that are translationally inactive in the
oocyte are polyadenylated concomitantly with translational activation in
oocyte maturation, whereas other mRNAs that are translationally active
during oogenesis are deadenylated and translationally silenced (36-38).
Thus, control of poly(A) tail synthesis is an important regulatory step
in gene expression. The 5' cap and poly(A) tail are thought to function
synergistically to regulate mRNA translational efficiency (39, 40).

RNA Packaging. Most RNA-binding proteins are assembled on a transcript
at the time of transcription, thus determining the translational fate of
the transcript (41). A highly conserved family of Y-box proteins is
found in cytoplasmic messenger ribonucleoprotein particles, where the
proteins are thought to play a role in restricting the recruitment of
mRNA to the translational machinery (41-43). The major mRNA-associated
protein, YB-1, destabilizes the interaction of eIF4E and the 5' mRNA cap
in vitro, and overexpression of YB-1 results in translational repression
in vivo (44). Thus, alterations in RNA packaging can also play an
important role in translational regulation.

Translation Alterations Encountered in Cancer 

Three main alterations at the translational level occur in cancer:
variations in mRNA sequences that increase or decrease translational
efficiency, changes in the expression or availability of components of
the translational machinery, and activation of translation through
aberrantly activated signal transduction pathways. The first alteration
affects the translation of an individual mRNA that may play a role in
carcinogenesis. The second and third alterations can lead to more global
changes, such as an increase in the overall rate of protein synthesis
and the translational activation of several mRNA species.

Variations in mRNA Sequence
Variations in mRNA sequence affect the translational efficiency of the
transcript. A brief description of these variations and examples of each
mechanism follow.

Mutations. Mutations in the mRNA sequence, especially in the 5' UTR,
can alter its translational efficiency, as seen in the following
examples.

c-myc. Saito et al. proposed that translation of full-length c-myc is
repressed, whereas in several Burkitt lymphomas that have deletions of
the mRNA 5' UTR, translation of c-myc is more efficient (45). More
recently, it was reported that the 5' UTR of c-myc contains an IRES, and
thus c-myc translation can be initiated by a cap-independent as well as
a cap-dependent mechanism (46, 47). In patients with multiple myeloma, a
C→T mutation in the c-myc IRES was identified (48) and found to cause an
enhanced initiation of translation via internal ribosomal entry (49).

BRCA1. A somatic point mutation (117 G→C) in position -3 with respect
to the start codon of the BRCA1 gene was identified in a highly
aggressive sporadic breast cancer (50). Chimeric constructs consisting
of the wild-type or mutated BRCA1 5' UTR and a downstream luciferase
reporter demonstrated a decrease in the translational efficiency with
the 5' UTR mutation.

Cyclin-dependent Kinase Inhibitor 2A. Some inherited melanoma kindreds
have a G→T transversion at base -34 of cyclin-dependent kinase
inhibitor-2A, which encodes a cyclin-dependent kinase
4/cyclin-dependent kinase 6 kinase inhibitor important in G1 checkpoint
regulation (51). This mutation gives rise to a novel AUG translation
initiation codon, creating an upstream open reading frame that competes
for scanning ribosomes and decreases translation from the wild-type AUG.

Alternate Splicing and Alternate Transcription Start Sites. Alterations
in splicing and alternate transcription sites can lead to variations in
5' UTR sequence, length, and secondary structure, ultimately impacting
translational efficiency.

ATM. The ATM gene has four noncoding exons in its 5' UTR that undergo
extensive alternative splicing (52). The contents of 12 different 5'
UTRs that show considerable diversity in length and sequence have been
identified. These divergent 5' leader sequences play an important role
in the translational regulation of the ATM gene.

mdm2. In a subset of tumors, overexpression of the oncoprotein mdm2
results in enhanced translation of the mdm2 mRNA. Use of different
promoters leads to two mdm2 transcripts that differ only in their 5'
leaders (53). The longer 5' UTR contains two upstream open reading
frames, and this mRNA is loaded with ribosomes inefficiently compared
with the short 5' UTR.

BRCA1. In a normal mammary gland, BRCA1 mRNA is expressed with a
shorter leader sequence (5' UTRa), whereas in sporadic breast cancer
tissue, BRCA1 mRNA is expressed with a longer leader sequence (5' UTRb);
the translational efficacy of transcripts containing 5' UTRb is 10 times
lower than that of transcripts containing 5' UTRa (54).

TGF-β3. TGF-β3 mRNA includes a 1.1-kb 5' UTR, which exerts an
inhibitory effect on translation. Many human breast cancer cell lines
contain a novel TGF-β3 transcript with a 5' UTR that is 870 nucleotides
shorter and has a 7-fold greater translational efficiency than the
normal TGF-β3 mRNA (55).

Alternate Polyadenylation Sites. Multiple polyadenylation signals
leading to the generation of several transcripts with differing 3' UTR
have been described for several mRNA species, such as the RET
proto-oncogene (56), ATM gene (52), tissue inhibitor of
metalloproteinases-3 (57), RHOA proto-oncogene (58), and calmodulin-I
(59). Although the effect of these alternate 3' UTRs on translation is
not yet known, they may be important in RNA-protein interactions that
affect translational recruitment. The role of these alterations in
cancer development and progression is unknown.


Alterations in the Components of the Translation Machinery

Alterations in the components of translation machinery can take many
forms.

Overexpression of eIF4E. Overexpression of eIF4E causes malignant
transformation in rodent cells (60) and the deregulation of HeLa cell
growth (61). Polunovsky et al. (62) found that eIF4E overexpression
substitutes for serum and individual growth factors in preserving
viability of fibroblasts, which suggests that eIF4E can mediate both
proliferative and survival signaling.

Elevated levels of eIF4E mRNA have been found in a broad spectrum of
transformed cell lines (63). eIF4E levels are elevated in all ductal
carcinoma in situ specimens and invasive ductal carcinomas, compared
with benign breast specimens evaluated with Western blot analysis (64,
65). Preliminary studies suggest that this overexpression is
attributable to gene amplification (66).

There are accumulating data suggesting that eIF4E overexpression can be
valuable as a prognostic marker. eIF4E overexpression was found in a
retrospective study to be a marker of poor prognosis in stages I to III
breast carcinoma (67). Verification of the prognostic value of eIF4E in
breast cancer is now under way in a prospective trial (67). However, in
a different study, eIF4E expression was correlated with the aggressive
behavior of non-Hodgkin's lymphomas (68). In a prospective analysis of
patients with head and neck cancer, elevated levels of eIF4E in
histologically tumor-free surgical margins predicted a significantly
increased risk of locoregional recurrence (9). These results all suggest
that eIF4E overexpression can be used to select patients who might
benefit from more aggressive systemic therapy. Furthermore, the head and
neck cancer data suggest that eIF4E overexpression is a field defect and
can be used to guide local therapy.

Alterations in Other Initiation Factors. Alterations in a number of
other initiation factors have been associated with cancer.
Overproduction of eIF4G, similar to eIF4E, leads to malignant
transformation in vitro (69). eIF-2α is found in increased levels in
bronchioloalveolar carcinomas of the lung (3). Initiation factor eIF-4A1
is overexpressed in melanoma (70) and hepatocellular carcinoma (71). The
p40 subunit of translation initiation factor 3 is amplified and
overexpressed in breast and prostate cancer (72), and the eIF3-p110
subunit is overexpressed in testicular seminoma (73). The role that
overexpression of these initiation factors plays in the development and
progression of cancer, if any, is not known.

Overexpression of S6K. S6K is amplified and highly overexpressed in the
MCF7 breast cancer cell line, compared with normal mammary epithelium
(74). In a study by Barlund et al. (74), S6K was amplified in 59 of 668
primary breast tumors, and a statistically significant association was
observed between amplification and poor prognosis.

Overexpression of PAP. PAP catalyzes 3' poly(A) synthesis. PAP is
overexpressed in human cancer cells compared with normal and virally
transformed cells (75). PAP enzymatic activity in breast tumors has been
correlated with PAP protein levels (76) and, in mammary tumor cytosols,
was found to be an independent factor for predicting survival (76).
Little is known, however, about how PAP expression or activity affects
the translational profile.

Alterations in RNA-binding Proteins. Even less is known about
alterations in RNA packaging in cancer. Increased expression and nuclear
localization of the RNA-binding protein YB-1 are indicators of a poor
prognosis for breast cancer (77), non-small cell lung cancer (78), and
ovarian cancer (79). However, this effect may be mediated at least in
part at the level of transcription, because YB-1 increases
chemoresistance by enhancing the transcription of a multidrug resistance
gene (80).

Activation of Signal Transduction Pathways 
Activation of signal transduction pathways by loss of tumor suppressor
genes or overexpression of certain tyrosine kinases can contribute to
the growth and aggressiveness of tumors. An important mutant in human
cancers is the tumor suppressor gene PTEN, which leads to the activation
of the PI3K/Akt pathway. Activation of PI3K and Akt induces the
oncogenic transformation of chicken embryo fibroblasts. The transformed
cells show constitutive phosphorylation of S6K and of 4E-BP1 (81). A
mutant Akt that retains kinase activity but does not phosphorylate S6K
or 4E-BP1 does not transform fibroblasts, which suggests a correlation
between the oncogenicity of PI3K and Akt and the phosphorylation of S6K
and 4E-BP1 (81).

Several tyrosine kinases such as platelet-derived growth factor,
insulin-like growth factor, HER2/neu, and epidermal growth factor
receptor are overexpressed in cancer. Because these kinases activate
downstream signal transduction pathways known to alter translation
initiation, activation of translation is likely to contribute to the
growth and aggressiveness of these tumors. Furthermore, the mRNAs for
many of these kinases themselves are under translational control. For
example, HER2/neu mRNA is translationally controlled both by a short
upstream open reading frame that represses HER2/neu translation in a
cell type-independent manner and by a distinct cell type-dependent
mechanism that increases translational efficiency (82). HER2/neu
translation is different in transformed and normal cells. Thus, it is
possible that alterations at the translational level can in part account
for the discrepancy between HER2/neu gene amplification detected by
fluorescence in situ hybridization and protein levels detected by
immunohistochemical assays.



Translation Targets of Selected Cancer Therapy 

Components of the translation machinery and signal pathways involved in
the activation of translation initiation represent good targets for
cancer therapy.

Targeting the mTOR Signaling Pathway: Rapamycin and Tumstatin

Rapamycin inhibits the proliferation of lymphocytes. It was initially
developed as an immunosuppressive drug for organ transplantation.
Rapamycin with FKBP 12 (FK506-binding protein, Mr 12,000) binds to mTOR
to inhibit its function.

Rapamycin causes a small but significant reduction in the initiation
rate of protein synthesis (83). It blocks cell growth in part by
blocking S6 phosphorylation and selectively suppressing the translation
of 5' TOP mRNAs, such as ribosomal proteins and elongation factors
(83-85). Rapamycin also blocks 4E-BP1 phosphorylation and inhibits
cap-dependent but not cap-independent translation (17, 86).

The rapamycin-sensitive signal transduction pathway, activated during
malignant transformation and cancer progression, is now being studied as
a target for cancer therapy (87). Prostate, breast, small cell lung,
glioblastoma, melanoma, and T-cell leukemia are among the cancer lines
most sensitive to the rapamycin analogue CCI-779 (Wyeth-Ayerst Research;
Ref. 87). In rhabdomyosarcoma cell lines, rapamycin is either cytostatic
or cytocidal, depending on the p53 status of the cell; p53 wild-type
cells treated with rapamycin arrest in the G1 phase and maintain their
viability, whereas p53 mutant cells accumulate in G1 and undergo
apoptosis (88, 89). In a recently reported study using human primitive
neuroectodermal tumor and medulloblastoma models, rapamycin exhibited
more cytotoxicity in combination with cisplatin and camptothecin than as
a single agent. In vivo, CCI-779 delayed growth of xenografts by 160%
after 1 week of therapy and 240% after 2 weeks. A single high-dose
administration caused a 37% decrease in tumor volume. Growth inhibition
in vivo was 1.3 times greater with cisplatin in combination with CCI-779
than with cisplatin alone (90). Thus, preclinical studies suggest that
rapamycin analogues are useful as single agents and in combination with
chemotherapy.

Rapamycin analogues CCI-779 and RAD001 (Novartis, Basel, Switzerland)
are now in clinical trials. Because of the known effect of rapamycin on
lymphocyte proliferation, a potential problem with rapamycin analogues
is immunosuppression. However, although prolonged immunosuppression can
result from rapamycin and CCI-779 administered on continuous-dose
schedules, the immunosuppressive effects of rapamycin analogues resolve
in ~24 h after therapy (91). The principal toxicities of CCI-779 have
included dermatological toxicity, myelosuppression, infection,
mucositis, diarrhea, reversible elevations in liver function tests,
hyperglycemia, hypokalemia, hypocalcemia, and depression (87, 92-94).
Phase II trials of CCI-779 have been conducted in advanced renal cell
carcinoma and in stage III/IV breast carcinoma patients who failed prior
chemotherapy. In the results reported in abstract form, although there
were no complete responses, partial responses were documented in both
renal cell carcinoma and breast carcinoma (94, 95). Thus, CCI-779 has
documented preliminary clinical activity in a previously treated,
unselected patient population.

Active investigation is under way into patient selection for mTOR
inhibitors. Several studies have found an enhanced efficacy of CCI-779
in PTEN-null tumors (30, 96). Another study found that six of eight
breast cancer cell lines were responsive to CCI-779, although only two
of these lines lacked PTEN (97). There was, however, a positive
correlation between Akt activation and CCI-779 sensitivity (97). This
correlation suggests that activation of the PI3K-Akt pathway, regardless
of whether it is attributable to a PTEN mutation or to overexpression of
receptor tyrosine kinases, makes cancer cells amenable to mTOR-directed
therapy. In contrast, lower levels of the target of mTOR, 4E-BP1, are
associated with rapamycin resistance; thus, a lower 4E-BP1/eIF4E ratio
may predict rapamycin resistance (98).

Another mode of activity for rapamycin and its analogues appears to be
through inhibition of angiogenesis. This activity may be through direct
inhibition of endothelial cell proliferation as a result of mTOR
inhibition in these cells or through inhibition of translation of such
proangiogenic factors as vascular endothelial growth factor in tumor
cells (99, 100).

The angiogenesis inhibitor tumstatin, another anticancer drug currently
under study, was also found recently to inhibit translation in
endothelial cells (101). Through a requisite interaction with integrin,
tumstatin inhibits activation of the PI3K/Akt pathway and mTOR in
endothelial cells and prevents dissociation of eIF4E from 4E-BP1,
thereby inhibiting cap-dependent translation. These findings suggest
that endothelial cells are especially sensitive to therapies targeting
the mTOR-signaling pathway.

Targeting eIF2α: EPA, Clotrimazole, mda-7, and Flavonoids

EPA is an n-3 polyunsaturated fatty acid found in the fish-based diets
of populations having a low incidence of cancer (102). EPA inhibits the
proliferation of cancer cells (103), as well as in animal models (104,
105). It blocks cell division by inhibiting translation initiation
(105). EPA releases Ca2+ from intracellular stores while inhibiting
their refilling, thereby activating PKR. PKR, in turn, phosphorylates
and inhibits eIF2α, resulting in the inhibition of protein synthesis at
the level of translation initiation. Similarly, clotrimazole, a potent
antiproliferative agent in vitro and in vivo, inhibits cell growth
through depletion of Ca2+ stores, activation of PKR, and phosphorylation
of eIF2α (106). Consequently, clotrimazole preferentially decreases the
expression of cyclins A, E, and D1, resulting in blockage of the cell
cycle in G1.

mda-7 is a novel tumor suppressor gene being developed as a gene
therapy agent. Adenoviral transfer of mda-7 (Ad-mda7) induces apoptosis
in many cancer cells including breast, colorectal, and lung cancer
(107-109). Ad-mda7 also induces and activates PKR, which leads to
phosphorylation of eIF2α and induction of apoptosis (110).

Flavonoids such as genistein and quercetin suppress tumor cell growth.
All three mammalian eIF2α kinases, PKR, heme-regulated inhibitor, and
PERK/PEK, are activated by flavonoids, with phosphorylation of eIF2α and
inhibition of protein synthesis (111).

Targeting eIF4A and eIF4E: Antisense RNA and Peptides

Antisense expression of eIF4A decreases the proliferation rate of
melanoma cells (112). Sequestration of eIF4E by overexpression of 4E-BP1
is proapoptotic and decreases tumorigenicity (113, 114). Reduction of
eIF4E with antisense RNA decreases soft agar growth, increases tumor
latency, and increases tumor doubling times (7). Antisense eIF4E RNA
treatment also reduces the expression of angiogenic factors (115) and
has been proposed as a potential adjuvant therapy for head and neck
cancers, particularly when elevated eIF4E is found in surgical margins.
Small molecule inhibitors that bind the eIF4G/4E-BP1-binding domain of
eIF4E are proapoptotic (116) and are also being actively pursued.



Exploiting Selective Translation for Gene Therapy 
A different therapeutic approach that takes advantage of the enhanced
cap-dependent translation in cancer cells is the use of gene therapy
vectors encoding suicide genes with highly structured 5' UTR. These
mRNA would thus be at a competitive disadvantage in normal cells and not
translate well, whereas in cancer cells, they would translate more
efficiently. For example, the introduction of the 5' UTR of fibroblast
growth factor-2 5' to the coding sequence of herpes simplex virus type-1
thymidine kinase gene allows for selective translation of herpes simplex
virus type-1 thymidine kinase gene in breast cancer cell lines compared
with normal mammary cell lines and results in selective sensitivity to
ganciclovir (117).



Toward the Future 

Translation is a crucial process in every cell. However, several
alterations in translational control occur in cancer. Cancer cells
appear to need an aberrantly activated translational state for survival,
thus allowing the targeting of translation initiation with surprisingly
low toxicity. Components of the translational machinery, such as eIF4E,
and signal transduction pathways involved in translation initiation,
such as mTOR, represent targets for cancer therapy. Inhibitors of mTOR
have already shown some preliminary activity in clinical trials. It is
possible that with the development of better predictive markers and
better patient selection, response rates to single-agent therapy can be
improved. Similar to other cytostatic agents, however, mTOR inhibitors
are most likely to achieve clinical utility in combination therapy. In
the interim, our increasing understanding of translation initiation and
signal transduction pathways promises to lead to the identification of
new therapeutic targets in the near future.

We have determined the relationship between mRNA and protein expression levels for selected genes
expressed in the yeast Saccharomyces cerevisiae growing at mid-log phase. The proteins contained in total yeast
cell lysate were separated by high-resolution two-dimensional (2D) gel electrophoresis. Over 150 protein spots
were excised and identified by capillary liquid chromatography-tandem mass spectrometry (LC-MS/MS).
Protein spots were quantified by metabolic labeling and scintillation counting. Corresponding mRNA levels
were calculated from serial analysis of gene expression (SAGE) frequency tables (V. E. Velculescu, L. Zhang,
W. Zhou, J. Vogelstein, M. A. Basrai, D. E. Bassett, Jr., P. Hieter, B. Vogelstein, and K. W. Kinzler, Cell
88:243-251, 1997). We found that the correlation between mRNA and protein levels was insufficient to predict
protein expression levels from quantitative mRNA data. Indeed, for some genes, while the mRNA levels were
of the same value the protein levels varied by more than 20-fold. Conversely, invariant steady-state levels of
certain proteins were observed with respective mRNA transcript levels that varied by as much as 30-fold.
Another interesting observation is that codon bias is not a predictor of either protein or mRNA levels. Our
results clearly delineate the technical boundaries of current approaches for quantitative analysis of protein
expression and reveal that simple deduction from mRNA transcript analysis is insufficient.



The description of the state of a biological system by the 
quantitative measurement of the system constituents is an es- 
sential but largely unexplored area of biology. With recent 
technical advances including the development of differential 
display-PCR (21), of cDNA microarray and DNA chip tech- 
nology (20, 27), and of serial analysis of gene expression 
(SAGE) (34, 35), it is now feasible to establish global and 
quantitative mRNA expression profiles of cells and tissues in 
species for which the sequence of all the genes is known. 
However, there is emerging evidence which suggests that 
mRNA expression patterns are necessary but are by them- 
selves insufficient for the quantitative description of biological 
systems. This evidence includes discoveries of posttranscrip- 
tional mechanisms controlling the protein translation rate (15), 
the half-lives of specific proteins or mRNAs (33), and the 
intracellular location and molecular association of the protein 
products of expressed genes (32). 

Proteome analysis, defined as the analysis of the protein 
complement expressed by a genome (26), has been suggested 
as an approach to the quantitative description of the state of a 
biological system by the quantitative analysis of protein expres- 
sion profiles (36). Proteome analysis is conceptually attractive 
because of its potential to determine properties of biological 
systems that are not apparent by DNA or mRNA sequence 
analysis alone. Such properties include the quantity of protein 
expression, the subcellular location, the state of modification, 
and the association with ligands, as well as the rate of change 
with time of such properties. In contrast to the genomes of a 
number of microorganisms (for a review, see reference 11) and 
the transcriptome of Saccharomyces cerevisiae (35), which have 
been entirely determined, no proteome map has been com- 
pleted to date. 

The most common implementation of proteome analysis is 
the combination of two-dimensional gel electrophoresis (2DE) 
(isoelectric focusing-sodium dodecyl sulfate [SDS]-polyacrylamide 
gel electrophoresis) for the separation and quantitation 
of proteins with analytical methods for their identification. 
2DE permits the separation, visualization, and quantitation of 
thousands of proteins reproducibly on a single gel (18, 24). By 
itself, 2DE is strictly a descriptive technique. The combination 
of 2DE with protein analytical techniques has added the pos- 
sibility of establishing the identities of separated proteins (1, 2) 
and thus, in combination with quantitative mRNA analysis, of 
correlating quantitative protein and mRNA expression mea- 
surements of selected genes. 

The recent introduction of mass spectrometric protein anal- 
ysis techniques has dramatically enhanced the throughput and 
sensitivity of protein identification to a level which now permits 
the large-scale analysis of proteins separated by 2DE. The 
techniques have reached a level of sensitivity that permits the 
identification of essentially any protein that is detectable in the 
gels by conventional protein staining (9, 29). Current protein 
analytical technology is based on the mass spectrometric gen- 
eration of peptide fragment patterns that are idiotypic for the 
sequence of a protein. Protein identity is established by corre- 
lating such fragment patterns with sequence databases (10, 22, 
37). Sophisticated computer software (8) has automated the 
entire process such that proteins are routinely identified with 
no human interpretation of peptide fragment patterns. 

In this study, we have analyzed the mRNA and protein levels 
of a group of genes expressed in exponentially growing cells of 
the yeast S. cerevisiae. Protein expression levels were quantified 
by metabolic labeling of the yeast proteins to a steady state, 
followed by 2DE and liquid scintillation counting of the se- 
lected, separated protein species. Separated proteins were 
identified by in-gel tryptic digestion of spots with subsequent 
analysis by microspray liquid chromatography-tandem mass 
spectrometry (LC-MS/MS) and sequence database searching. 
The corresponding mRNA transcript levels were calculated 
from SAGE frequency tables (35). 

This study, for the first time, explores a quantitative comparison 
of mRNA transcript and protein expression levels for 
a relatively large number of genes expressed in the same metabolic 
state. The resultant correlation is insufficient for prediction 
of protein levels from mRNA transcript levels. We have 
also compared the relative amounts of protein and mRNA 
with the respective codon bias values for the corresponding 
genes. This comparison indicates that codon bias by itself is 
insufficient to accurately predict either the mRNA or the protein 
expression levels of a gene. In addition, the results demonstrate 
that only highly expressed proteins are detectable by 
2DE separation of total cell lysates and that therefore the 
construction of complete proteome maps with current technology 
will be very challenging, irrespective of the type of organism. 

FIG. 1. Schematic illustration of proteome analysis by 2DE and mass spectrometry. In part I, proteins are separated by 2DE, stained spots are excised and subjected 
to in-gel digestion with trypsin, and the resulting peptides are separated by on-line capillary high-performance liquid chromatography. In part II, a peptide is shown 
eluting from the column in part I. The peptide is ionized by electrospray ionization and enters the mass spectrometer. The mass of the ionized peptide is detected, and 
the first quadrupole mass filter allows only the specific mass-to-charge ratio of the selected peptide ion to pass into the collision cell. In the collision cell, the energized, 
ionized peptides collide with neutral argon gas molecules. Fragmentation of the peptide is essentially random but occurs mainly at the peptide bonds, resulting in smaller 
peptides of differing lengths (masses). These peptide fragments are detected as a tandem mass (MS/MS) spectrum in the third quadrupole mass filter where two ion 
series are recorded simultaneously, one each from sequencing inward from the N and C termini of the peptide, respectively. In part III, the MS/MS spectrum from the 
selected, ionized peptide is compared to predicted tandem mass spectra computer generated from a sequence database. Provided that the peptide sequence exists in 
the database, the peptide and, by association, the protein from which the peptide was derived can be identified. Unambiguous protein identification is attained in a single 
analysis because multiple peptides are identified as being derived from the same protein. 

MATERIALS AND METHODS 

Yeast strain and growth conditions. The source of protein and message transcripts 
for all experiments was YPH499 (MATa ura3-52 lys2-801 ade2-101 
leu2-Δ1 his3-Δ200 trp1-Δ63) (30). Logarithmically growing cells were obtained by 
growing yeast cells to early log phase (3 × 10^6 cells/ml) in YPD rich medium 
(YPD supplemented with 6 mM uracil, 4.8 mM adenine, and 24 mM tryptophan) 
at 30°C (35). Metabolic labeling of protein was accomplished in YPD medium 
exactly as described elsewhere (4) with the exception that 1 ml of cells was 
labeled with 3 mCi to offset methionine present in YPD medium. Protein was 
harvested as described by Garrels and coworkers (12). Harvested protein was 
lyophilized, resuspended in isoelectric focusing gel rehydration solution, and 
stored at -80°C. 

2DE. Soluble proteins were run in the first dimension by using a commercial 
flatbed electrophoresis system (Multiphor II; Pharmacia Biotech). Immobilized 
polyacrylamide gel (IPG) dry strips with nonlinear pH 3.0 to 10.0 gradients 
(Amersham-Pharmacia Biotech) were used for the first-dimension separation. 
Forty micrograms of protein from whole-cell lysates was mixed with IPG strip 
rehydration buffer (8 M urea, 2% Nonidet P-40, 10 mM dithiothreitol), and 250 
to 380 µl of solution was added to individual lanes of an IPG strip rehydration 
tray (Amersham-Pharmacia Biotech). The strips were allowed to rehydrate at 
room temperature for 1 h. The samples were run at 300 V-10 mA-5 W for 2 h, 
then ramped to 3,500 V-10 mA-5 W over a period of 3 h, and then kept at 3,500 
V-10 mA-5 W for 15 to 19 h. At the end of the first-dimension run (60 to 70 kV·h), 
the IPG strips were reequilibrated for 8 min in 2% (wt/vol) dithiothreitol in 
2% (wt/vol) SDS-6 M urea-30% (wt/vol) glycerol-0.05 M Tris HCl (pH 6.8) and 
for 4 min in 2.5% iodoacetamide in 2% (wt/vol) SDS-6 M urea-30% (wt/vol) 
glycerol-0.05 M Tris HCl (pH 6.8). Following reequilibration, the strips were 
transferred and apposed to 10% polyacrylamide second-dimension gels. Polyacrylamide 
gels were poured in a casting stand with 10% acrylamide-2.67% 
piperazine diacrylamide-0.375 M Tris base-HCl (pH 8.8)-0.1% (wt/vol) SDS-0.05% 
(wt/vol) ammonium persulfate-0.05% TEMED (N,N,N',N'-tetramethylethylenediamine) 
in Milli-Q water. The apparatus used to run second-dimension gels 
was a noncommercial apparatus from Oxford Glycosciences, Inc. Once the IPG 
strips were apposed to the second-dimension gels, they were immediately run at 
50 mA (constant)-500 V-85 W for 20 min, followed by 200 mA (constant)-500 
V-85 W until the buffer front line was 10 to 15 mm from the bottom of the gel. 
Gels were removed and silver stained according to the procedure of Shevchenko 
et al. (29). 

FIG. 2. 2D silver-stained gel of the proteins in yeast total cell lysate. Proteins were separated in the first dimension (horizontal) by isoelectric focusing and then in 
the second dimension (vertical) by molecular weight sieving. Protein spots (156) were chosen to include the entire range of molecular weights, isoelectric focusing points, 
and staining intensities. Spots were excised, and the corresponding protein was identified by mass spectrometry and database searching. The spots are labeled on the 
gel and correspond to the data presented in Table 1. Molecular weights are given in thousands. 

Protein identification. Gels were exposed to X-ray film overnight, and then the 
silver staining and film were used to excise 156 spots of varying intensities, 
molecular weights, and isoelectric focusing points. In order to increase the 
detection limit by mass spectrometry, spots were cut out and pooled from up to 
four identical cold, silver-stained gels. In-gel tryptic digests of pooled spots were 
performed as described previously (29). Tryptic peptides were analyzed by 
microcapillary LC-MS with automated switching to MS/MS mode for peptide 
fragmentation. Spectra were searched against the composite OWL protein 
sequence database (version 30.2; 250,514 protein sequences) (24a) by using the 
computer program Sequest (8), which matches theoretical and acquired tandem 
mass spectra. A protein match was determined by comparing the number of 
peptides identified and their respective cross-correlation scores. All protein 
identifications were verified by comparison with theoretical molecular weights 
and isoelectric points. 



mRNA quantitation. Velculescu and coworkers have previously generated 
frequency tables for yeast mRNA transcripts from the same strain grown under 
the same stated conditions as described herein (35). The SAGE technology is 
based on two main principles. First, a short sequence tag (15 bp) that contains 
sufficient information uniquely to identify a transcript is generated. A single tag 
is usually generated from each mRNA transcript in the cell which corresponds to 
15 bp at the 3'-most cutting site for NlaIII. Second, many transcript tags can be 
concatenated into a single molecule and then sequenced, revealing the identity of 
multiple tags simultaneously. Over 20,000 transcripts were sequenced from yeast 
strain YPH499 growing at mid-log phase on glucose. Assuming the previously 
derived estimate of 15,000 mRNA molecules per cell (16), this would represent 
a 1.3-fold coverage even for mRNA molecules present at a single copy per cell 
and would provide a 72% probability of detecting such transcripts. Computer 
software which took for input the gene detected, examined the nucleotide 
sequence, and performed the calculation as described by Velculescu and coworkers 
(35) was written. In practice, we found that for 21 of 128 (16%) genes examined 
viable mRNA levels from SAGE data could not be calculated. This was because 
(i) no CATG site was found in the open reading frame (ORF), (ii) a CATG site 
was found but the corresponding 10-bp putative SAGE tag was not found in the 
frequency tables, or (iii) identical putative SAGE tags were present for multiple 
genes (e.g., TDH2_YEAST and TDH3_YEAST). 
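
The coverage and detection figures quoted above can be checked with a short calculation. The sketch below is not the software described in the text; it assumes that tags are sampled at random, so that the number of tags observed for a transcript is approximately Poisson distributed. The function names are illustrative. With the round figures of 20,000 tags and 15,000 mRNA molecules per cell, this approximation gives about 1.33-fold coverage and a detection probability of roughly 0.74 for a single-copy transcript, close to the 72% cited from reference 35, which presumably reflects the exact tag count and calculation used there.

```python
import math


def sage_coverage(tags_sequenced: int, mrna_per_cell: int) -> float:
    """Average number of tags sampled per mRNA molecule of one cell."""
    return tags_sequenced / mrna_per_cell


def detection_probability(tags_sequenced: int, mrna_per_cell: int,
                          copies_per_cell: float = 1.0) -> float:
    """Chance of observing at least one tag for a transcript.

    Assumption (not from the source): tag counts follow a Poisson
    distribution with mean = copies_per_cell * coverage.
    """
    expected_tags = copies_per_cell * sage_coverage(tags_sequenced, mrna_per_cell)
    return 1.0 - math.exp(-expected_tags)


coverage = sage_coverage(20_000, 15_000)
p_single = detection_probability(20_000, 15_000)
print(f"coverage: {coverage:.2f}-fold")
print(f"P(detect single-copy transcript): {p_single:.2f}")
```

The same function shows why low-abundance values carry more uncertainty: the expected tag count for a transcript falls in proportion to its copy number.
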
TABLE 1. Expressed genes identified from 2D gel in Fig. 2 

a YPD gene names are available from the YPD website (39). 

b NA, calculation could not be performed or was not available. 

c mRNA data inconclusive or NA. 

d No methionines in predicted ORF; therefore, protein concentration was not 
determined. 

e Measured molecular weight or pI did not match theoretical molecular weight 
or pI. 


Protein quantitation. [35S]methionine-labeled gels were exposed to X-ray film 
overnight, and then the silver stain and film were used to excise 156 spots of 
varying intensities, molecular weights, and pIs. The excised spots were placed in 
0.6-ml microcentrifuge tubes, and scintillation cocktail (100 µl) was added. The 
samples were vortexed and counted. In addition, two parallel gels were 
electroblotted to polyvinylidene difluoride membranes. The membranes were exposed 
to X-ray film, and four intense single spots were excised from each membrane 
and subjected to amino acid analysis. For these four spots, a mean of 209 ± 4 
cpm/pmol of protein/methionine was found. This number was used to quantitate 
all remaining spots in conjunction with the number of methionines present in the 
protein. 
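
As a worked illustration of this calibration, the sketch below converts scintillation counts into picomoles of protein using the 209 cpm/pmol/methionine value reported above. The conversion to copies per cell needs the number of cells represented by the material loaded on the gel, which is not stated here; `cells_loaded` is therefore a hypothetical parameter, and the example numbers are invented for illustration only.

```python
AVOGADRO = 6.022e23            # molecules per mole
CPM_PER_PMOL_PER_MET = 209.0   # calibration value reported in the text


def protein_pmol(spot_cpm: float, n_met: int) -> float:
    """Picomoles of protein in a spot from its counts and methionine content."""
    if n_met <= 0:
        # Proteins without methionine carry no label and cannot be quantified.
        raise ValueError("no methionine in protein; quantitation not possible")
    return spot_cpm / (CPM_PER_PMOL_PER_MET * n_met)


def copies_per_cell(spot_cpm: float, n_met: int, cells_loaded: float) -> float:
    """Molecules per cell; cells_loaded is a hypothetical input (see lead-in)."""
    molecules = protein_pmol(spot_cpm, n_met) * 1e-12 * AVOGADRO
    return molecules / cells_loaded


# Invented example: a spot with 2,090 cpm from a protein with 5 methionines.
example_pmol = protein_pmol(2090, 5)
example_copies = copies_per_cell(2090, 5, cells_loaded=1e7)
print(example_pmol, round(example_copies))
```
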

To ensure that proteins were labeled to equilibrium, parallel 2D gels were 
prepared and run on yeast metabolically labeled for 1, 2, 6, or 18 h. The 
corresponding 156 spots were excised from each gel, and radioactivity was mea- 
sured by liquid scintillation counting for each spot. Calculated protein levels were 
highly reproducible for all time points measured after 1 h. 

Calculation of codon bias and predicted half-life. Codon bias values were 
extracted from the YPD spreadsheet (17). Protein half-lives were calculated 
based on the N-end rule (33). When the N-terminal processing was not known 
experimentally, it was predicted based on the affinity of methionine 
aminopeptidase (31). 

RESULTS 

Characteristics of proteome approach. Nearly every facet of 
proteome analysis hinges on the unambiguous identification of 
large numbers of expressed proteins in cells. Several tech- 
niques have been described previously for the identification of 
proteins separated by 2DE, including N-terminal and internal 
sequencing (1, 2), amino acid analysis (38), and more recently 
mass spectrometry (25). We utilized techniques based on mass 
spectrometry because they afford the highest levels of sensitiv- 
ity and provide unambiguous identification. The specific pro- 
cedure used is schematically illustrated in Fig. 1 and is based 
on three principles. First, proteins are removed from the gel by 
proteolytic in-gel digestion, and the resulting peptides are sep- 
arated by on-line capillary high-performance liquid chromatog- 
raphy. Second, the eluting peptides are ionized and detected, and 
the specific peptide ions are selected and fragmented by the 
mass spectrometer. To achieve this, the mass spectrometer 
switches between the MS mode (for peptide mass identifica- 
tion) and the MS/MS mode (for peptide characterization and 
sequencing). Selected peptides are fragmented by a process 
called collision-induced dissociation (CID) to generate a tan- 
dem mass spectrum (MS/MS spectrum) that contains the pep- 
tide sequence information. Third, individual CID mass spectra 
are then compared by computer algorithms to predicted spec- 
tra from a sequence database. This results in the identification 
of the peptide and, by association, the protein(s) in the spot. 
Unambiguous protein identification is attained in a single anal- 
ysis by the detection of multiple peptides derived from the 
same protein. 

Protein identification. Yeast total cell protein lysate (40 µg), 
metabolically labeled with [35S]methionine, was electrophoretically 
separated by isoelectric focusing in the first dimension 
and by SDS-10% polyacrylamide gel electrophoresis in 
the second dimension. Proteins were visualized by silver stain- 
ing and by autoradiography. Of the more than 1,000 proteins 
visible by silver staining, 156 spots were excised from the gel 
and subjected to in-gel tryptic digestion, and the resulting 
peptides were analyzed and identified by microspray LC- 
MS/MS techniques as described above. The proteins in this 
study were all identified automatically by computer software 
with no human interpretation of mass spectra. They are indi- 
cated in Fig. 2 and detailed in Table 1. 

The CID spectra shown in Fig. 3 indicate that the quality of 
the identification data generated was suitable for unambiguous 
protein identification. The spectra represent the amino acid 
sequences of tryptic peptides NSGDIVNLGSIAGR (Fig. 3A) 
and FAVGAFTDSLR (Fig. 3B). Both peptides were derived 
from protein S57593 (hypothetical protein YMR226C), which 
migrated to spot 114 (molecular weight, 29,156; pI, 6.59) in the 
2D gel in Fig. 2. Five other peptides from the same analysis 
were also computer matched to the same protein sequence. 
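
The two peptide sequences can be related to the precursor m/z values selected for fragmentation (687.2 and 592.6; Fig. 3) by a simple mass calculation. The sketch below uses standard monoisotopic residue masses and assumes doubly protonated ions; the charge state is an assumption, since it is not stated in the text. It illustrates the arithmetic only and is not the Sequest algorithm. The calculated values fall within about half an m/z unit of the selected values, which is plausible for a quadrupole instrument that does not resolve the isotope envelope.

```python
# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide for the free N and C termini
PROTON = 1.00728


def peptide_mz(sequence: str, charge: int = 2) -> float:
    """m/z of an unmodified peptide; charge=2 is an assumption."""
    neutral_mass = sum(MONO[aa] for aa in sequence) + WATER
    return (neutral_mass + charge * PROTON) / charge


for seq, selected in (("NSGDIVNLGSIAGR", 687.2), ("FAVGAFTDSLR", 592.6)):
    print(seq, f"calculated {peptide_mz(seq):.2f}", f"selected {selected}")
```
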

Protein and mRNA quantitation. For the 156 genes investi- 
gated, the protein expression levels ranged from 2,200 (PGM2) 
to 863,000 (TDH2/TDH3) copies/cell. The levels of mRNA for 
each of the genes identified were calculated from SAGE fre- 
quency tables (35). These tables contain the mRNA levels for 
4,665 genes in yeast strain YPH499 grown to mid-log phase in 
YPD medium on glucose as a carbon source. In some in- 
stances, the mRNA levels could not be calculated for reasons 
stated in Materials and Methods. For the proteins analyzed in 
this study, mean transcript levels varied from 0.7 to 473 copies/ 
cell. 

Selection of the sample population for mRNA-protein ex- 
pression level correlation. The protein spots selected for iden- 
tification were selected from spots visible by silver staining in 
the 2D gel. An attempt was made not to include spots where 
overlap with other spots was readily apparent. The number of 
proteins identified was 156 (Table 1). Some proteins migrated 
to more than one spot (presumably due to differential protein 
processing or modifications), and protein levels from these 
spots were calculated by integrating the intensities of the dif- 
ferent spots. The 156 protein spots analyzed represented the 
products of 128 different genes. Genes were excluded from the 
correlation analysis only if part of the data set was missing; i.e., 
genes were excluded if (i) no mRNA expression data were 
available for the protein or putative SAGE tags were ambiguous, 
(ii) the amino acid sequence did not contain methionine, 
(iii) more than a single protein was conclusively identified as 
migrating to the same gel spot, or (iv) the theoretical and 
observed pIs and molecular weights could not be reconciled. 
After these criteria were applied, the number of genes used in 
the correlation analysis was 106. 

FIG. 3. Tandem mass (MS/MS) spectra resulting from analysis of a single spot on a 2D gel. The first quadrupole selected a single mass-to-charge ratio (m/z) of 687.2 
(A) or 592.6 (B), while the collision cell was filled with argon gas, and a voltage which caused the peptide to undergo fragmentation by CID was applied. The third 
quadrupole scanned the mass range from 50 to 1,400 m/z. The computer program Sequest (8) was utilized to match MS/MS spectra to amino acid sequence by database 
searching. Both spectra matched peptides from the same protein, S57593 (yeast hypothetical protein YMR226C). Five other peptides from the same analysis were 
matched to the same protein. 
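
The four exclusion criteria (i) to (iv) amount to a filter over per-gene records. A minimal sketch of such a filter follows; the record layout, field names, and example entries are hypothetical and are not taken from Table 1.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GeneRecord:
    name: str
    mrna_copies_per_cell: Optional[float]  # None if no SAGE value was available
    sage_tag_ambiguous: bool               # putative tag shared with another gene
    n_met: int                             # methionines in the predicted protein
    spot_has_comigrating_protein: bool     # more than one protein in the spot
    mw_pi_reconciled: bool                 # observed and theoretical MW/pI agree


def usable_for_correlation(g: GeneRecord) -> bool:
    if g.mrna_copies_per_cell is None or g.sage_tag_ambiguous:  # criterion (i)
        return False
    if g.n_met == 0:                                            # criterion (ii)
        return False
    if g.spot_has_comigrating_protein:                          # criterion (iii)
        return False
    if not g.mw_pi_reconciled:                                  # criterion (iv)
        return False
    return True


# Invented records, one failing each criterion and one passing all of them.
records = [
    GeneRecord("GENE_A", 3.2, False, 7, False, True),
    GeneRecord("GENE_B", None, False, 5, False, True),
    GeneRecord("GENE_C", 12.0, True, 4, False, True),
    GeneRecord("GENE_D", 1.5, False, 0, False, True),
    GeneRecord("GENE_E", 0.9, False, 9, True, True),
    GeneRecord("GENE_F", 45.0, False, 6, False, False),
]
kept = [g.name for g in records if usable_for_correlation(g)]
print(kept)
```
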



Codon bias and predicted half-lives. Codon bias is thought 
to be an indicator of protein expression, with highly expressed 
proteins having large codon bias values. The codon bias distri- 
bution for the entire set of more than 6,000 predicted yeast 
gene ORFs is presented in Fig. 4A. The interval with the 
largest frequency of genes is between the codon bias values of 
0.0 and 0.1. This segment contains more than 2,500 genes. The 
distribution of the codon bias values of the 128 different genes 
found in this study (all protein spots from Fig. 2) is shown in 
Fig. 4B, and protein half-lives (predicted from applying the 
N-end rule [33] to the experimentally determined or predicted 
protein N termini) are shown in Fig. 4C. No genes were iden- 
tified with codon bias values less than 0.1 even though thou- 
sands of genes exist in this category. In addition, nearly all of 
the proteins identified had long predicted half-lives (greater 
than 30 h). 

Correlation of mRNA and protein expression levels. The 
correlation between mRNA and protein levels of the genes 
selected as described above is shown in Fig. 5. For the entire 
group (106 genes) for which a complete data set was gener- 
ated, there was a general trend of increased protein levels 
resulting from increased mRNA levels. The Pearson product 
moment correlation coefficient for the whole data set (106 
genes) was 0.935. This number is highly biased by a small 
number of genes with very large protein and message levels. A 
more representative subset of the data is shown in the inset of 
Fig. 5. It shows genes for which the message level was below 10 
copies/cell and includes 69% (73 of 106 genes) of the data used 
in the study. The Pearson product moment correlation coeffi- 
cient for this data set was only 0.356. We also found that levels 
of protein expression coded for by mRNA with comparable 
abundance varied by as much as 30-fold and that the mRNA 
levels coding for proteins with comparable expression levels 
varied by as much as 20-fold. 
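
The sensitivity of the Pearson coefficient to a few very large values can be reproduced with synthetic numbers. The data in the sketch below are invented for illustration and are not taken from Fig. 5: eight low-abundance points with almost no linear relationship give a coefficient of about 0.14, and adding only two high-abundance points raises it above 0.99.

```python
import math


def pearson(xs, ys):
    """Pearson product moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Synthetic values in arbitrary units (not measurements from the study).
low_mrna = [1, 2, 3, 4, 5, 6, 7, 8]
low_protein = [5, 1, 7, 2, 8, 3, 6, 4]
high_mrna = [100, 200]
high_protein = [110, 190]

r_low = pearson(low_mrna, low_protein)
r_all = pearson(low_mrna + high_mrna, low_protein + high_protein)
print(f"low-abundance points only: r = {r_low:.3f}")
print(f"with two high-abundance points: r = {r_all:.3f}")
```
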

The distortion of the correlation value induced by the un- 
even distribution of the data points along the x axis is further 
demonstrated by the analysis in Fig. 6. The 106 samples in- 
cluded in the study were ranked by protein abundance, and the 
Pearson product moment correlation coefficient was repeat- 
edly calculated after including progressively more, and higher- 
abundance, proteins in each calculation. The correlation values 
remained relatively stable in the range of 0.1 to 0.4 if the 
lowest-expressed 40 to 95 proteins used in this study were 
included. However, the correlation value steadily climbed by 
the inclusion of each of the 11 very highly expressed proteins. 
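
The ranked, cumulative calculation described above can be sketched as follows. This is an illustration of the procedure under stated assumptions, not the authors' code: the function names are hypothetical, the input is a list of (mRNA, protein) pairs, and the example values are synthetic.

```python
import math


def pearson(xs, ys):
    """Pearson product moment correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def cumulative_correlation(pairs, start):
    """Rank genes by protein level, then compute r for the k lowest, k = start..n.

    pairs: list of (mrna_level, protein_level); returns a list of (k, r).
    """
    ranked = sorted(pairs, key=lambda pair: pair[1])
    trend = []
    for k in range(start, len(ranked) + 1):
        mrna = [m for m, _ in ranked[:k]]
        protein = [p for _, p in ranked[:k]]
        trend.append((k, pearson(mrna, protein)))
    return trend


# Synthetic values: eight low-abundance genes plus two abundant ones.
pairs = list(zip([1, 2, 3, 4, 5, 6, 7, 8, 100, 200],
                 [5, 1, 7, 2, 8, 3, 6, 4, 110, 190]))
trend = cumulative_correlation(pairs, start=8)
for k, r in trend:
    print(k, round(r, 3))
```

In the study the corresponding calculation started with the 40 lowest-abundance genes of the 106 and added the remaining genes in order of abundance.
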

Correlation of protein and mRNA expression levels with 
codon bias. Codon bias is the propensity for a gene to utilize 
the same codon to encode an amino acid even though other 
codons would insert the identical amino acid in the growing 
polypeptide sequence. It is further thought that highly ex- 
pressed proteins have large codon biases (3). To assess the 
value of codon bias for predicting mRNA and protein levels in 
exponentially growing yeast cells, we plotted the two experi- 
mental sets of data versus the codon bias (Fig. 7). The distri- 
bution patterns for both mRNA and protein levels with respect 
to codon bias were highly similar. There was high variability in 
the data within the codon bias range of 0.8 to 1.0. Although a 
large codon bias generally resulted in higher protein and mes- 
sage expression levels, codon bias did not appear to be predic- 
tive of either protein levels or mRNA levels in the cell. 

DISCUSSION 

The desired end point for the description of a biological 
system is not the analysis of mRNA transcript levels alone but 
also the accurate measurement of protein expression levels and 
their respective activities. Quantitative analysis of global 
mRNA levels currently is a preferred method for the analysis 
of the state of cells and tissues (11). Several methods which 
either provide absolute mRNA abundance (34, 35) or relative 
mRNA levels in comparative analyses (20, 27) have been described 
elsewhere. The techniques are fast and exquisitely sensitive 
and can provide mRNA abundance for potentially any 
expressed gene. Measured mRNA levels are often implicitly or 
explicitly extrapolated to indicate the levels of activity of the 
corresponding protein in the cell. Quantitative analysis of protein 
expression levels (proteome analysis) is much more time-consuming 
because proteins are analyzed sequentially one by 
one and is not general because analyses are limited to the 
relatively highly expressed proteins. Proteome analysis does, 
however, provide types of data that are of critical importance 
for the description of the state of a biological system and that 
are not readily apparent from the sequence and the level of 
expression of the mRNA transcript. This study attempts to 
examine the relationship between mRNA and protein expression 
levels for a large number of expressed genes in cells 
representing the same state. 

FIG. 4. ...richment samples mainly highly expressed and long-lived proteins. Genes encoding 
highly expressed proteins generally have large codon bias values. (A) Distribution 
of the yeast genome (more than 6,000 genes) based on codon bias. The 
interval with the largest frequency of genes is 0.0 to 0.1, with more than 2,500 
genes. (B) Distribution of the genes from identified proteins in this study based 
on codon bias. No genes with codon bias values less than 0.1 were detected in this 
study. (C) Distribution of identified proteins in this study based on predicted 
half-life (estimated by N-end rule). 

Limits in the sensitivity of current protein analysis technology 
precluded a completely random sampling of yeast proteins. 
We therefore based the study on those proteins visible by silver 
staining on a 2D gel. Of the more than 1,000 visible spots, 156 
were chosen to include the entire range of molecular weights, 
isoelectric focusing points, and staining intensities displayed on 
the 2D protein pattern. The genes identified in this study 
shared a number of properties. First, all of the proteins in this 
study had a codon bias of greater than 0.1 and 93% were 
greater than 0.2 (Fig. 4B). Second, with few exceptions, the 
proteins in this study had long predicted half-lives according to 
the N-end rule (Fig. 4C). Third, low-abundance proteins with 
regulatory functions such as transcription factors or protein 
kinases were not identified. 

FIG. 5. Correlation between protein and mRNA levels for 106 genes in yeast growing at log phase with glucose as a carbon source. mRNA and protein levels were 
calculated as described in Materials and Methods. The data represent a population of genes with protein expression levels visible by silver staining on a 2D gel chosen 
to include the entire range of molecular weights, isoelectric focusing points, and staining intensities. The inset shows the low-end portion of the main figure. It contains 
69% of the original data set. The Pearson product moment correlation for the entire data set was 0.935. The correlation for the inset containing 73 proteins (69%) was 
only 0.356. 

Because the population of proteins used in this study ap- 
pears to be fairly homogeneous with respect to predicted half- 
life and codon bias, it might be expected that the correlation of 
the mRNA and protein expression levels would be stronger for 
this population than for a random sample of yeast proteins. We 
tested this assumption by evaluating the correlation value if 
different subsets of the available data were included in the 
calculation. The 106 proteins were ranked from lowest to high- 
est protein expression level, and the trend in the correlation 
value was evaluated by progressively including more of the 
higher-abundance proteins in the calculation (Fig. 6). The cor- 
relation value when only the lower-abundance 40 to 93 pro- 
teins were examined was consistently between 0.1 and 0.4. If 
the 11 most abundant proteins were included, the correlation 
steadily increased to 0.94. We therefore expect that the corre- 
lation for all yeast proteins or for a random selection would be 
less than 0.4. The observed level of correlation between 
mRNA and protein expression levels suggests the importance 
of posttranslational mechanisms controlling gene expression. 
Such mechanisms include translational control (15) and con- 
trol of protein half-life (33). Since these mechanisms are also 
active in higher eukaryotic cells, we speculate that there is no 
predictive correlation between steady-state levels of mRNA 
and those of protein in mammalian cells. 

Like other large-scale analyses, the present study has several 
potential sources of error related to the methods used to de- 
termine mRNA and protein expression levels. The mRNA 
levels were calculated from frequency tables of SAGE data. 
This method is highly quantitative because it is based on actual 
sequencing of unique tags from each gene, and the number of 
times that a tag is represented is proportional to the number of 
mRNA molecules for a specific gene. This method has some 
limitations including the following: (i) the magnitude of the 
error in the measurement of mRNA levels is inversely propor- 
tional to the mRNA levels, (ii) SAGE tags from highly similar 
genes may not be distinguished and therefore are summed, (iii) 
some SAGE tags are from sequences in the 3' untranslated 
region of the transcript, (iv) incomplete cleavage at the SAGE 
tag site by the restriction enzyme can result in two tags repre- 
senting one mRNA, and (v) some transcripts actually do not 
generate a SAGE tag (34, 35). 

For the SAGE method, the error associated with a value 
increases with a decreasing number of transcripts per cell. The 
conclusions drawn from this study are dependent on the qual- 
ity of the mRNA levels from previously published data (35). 
Since more than 65% of the mRNA levels included in this 
study were calculated to 10 copies/cell or less (40% were less 
than 4 copies/cell), the error associated with these values may 
be quite large. The mRNA levels were calculated from more 
than 20,000 transcripts. Assuming that the estimate of 15,000 
mRNA molecules per cell is correct (16), this would mean that 
mRNA transcripts present at only a single copy per cell would 
be detected 72% of the time (35). The mRNA levels for each 
gene were carefully scrutinized, and only mRNA levels for 
which a high degree of confidence existed were included in the 
correlation value. 

FIG. 6. Effect of highly abundant proteins on Pearson product moment correlation coefficient for mRNA and protein abundance in yeast. The set of 106 genes was 
ranked according to protein abundance, and the correlation value was calculated by including the 40 lowest-abundance genes and then progressively including the 
remaining 66 genes in order of abundance. The correlation value climbs as the final 11 highly abundant proteins are included. 
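
The detection estimate for single-copy transcripts can be approximated with a simple sampling model. The sketch below assumes that every sequenced tag is an independent draw from the cellular mRNA pool; the function name and this model are illustrative assumptions and are not taken from the cited reference (35). With exactly 20,000 tags and 15,000 mRNA molecules per cell, this model gives a probability of about 0.74, close to but not identical with the reported 72%, which may rest on a different tag count or model.

```python
def detection_probability(tags_sequenced: int,
                          mrna_per_cell: int,
                          copies_per_cell: float = 1.0) -> float:
    """Chance that at least one sequenced tag comes from a given transcript.

    Assumption (not from the source): each tag is an independent draw, and
    the transcript makes up copies_per_cell / mrna_per_cell of the pool.
    """
    fraction_of_pool = copies_per_cell / mrna_per_cell
    # Probability that none of the tags hits the transcript, then complement.
    return 1.0 - (1.0 - fraction_of_pool) ** tags_sequenced


if __name__ == "__main__":
    # Single-copy transcript, 20,000 tags, 15,000 mRNA molecules per cell.
    print(f"{detection_probability(20_000, 15_000):.3f}")
```

Under the same assumptions, raising `copies_per_cell` shows how quickly the detection probability approaches 1 for more abundant messages.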

Protein abundance was determined by metabolic radiolabel- 
ing with [35S]methionine. The calculation required knowledge 
of three variables: the number of methionines in the mature 
protein, the radioactivity contained in the protein, and the 
specific activity of the radiolabel normalized per methionine. 
The number of methionines per protein was determined from 
the amino acid sequence of the proteins identified by tandem 
mass spectrometry. For some proteins, it was not known 
whether the methionine of the nascent polypeptide was pro- 
cessed away. The N termini of those proteins were predicted 
based on the specificity of methionine aminopeptidase (31). If 
the N-terminal processing did not conform to the predicted 
specificity of processing enzymes, the calculation of the num- 
ber of methionines would be affected. This discrepancy would 
affect most the quantitation of a protein with a very low num- 
ber of methionines. The average number of calculated methi- 
onines per protein in this study was 7.2. We therefore expect 
the potential for erroneous protein quantitation due to un- 
usual N-terminal processing to be small. 
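
The three-variable calculation described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names, the units (cpm per pmol of methionine), and the `cells_loaded` parameter used to convert to copies per cell are all assumptions introduced here.

```python
AVOGADRO = 6.022e23  # molecules per mole


def mature_methionine_count(sequence: str, initiator_met_removed: bool) -> int:
    """Count methionines in the mature protein (one-letter amino acid code).

    Assumption: the only N-terminal processing considered is removal of
    the initiator methionine.
    """
    count = sequence.upper().count("M")
    if initiator_met_removed and sequence.upper().startswith("M"):
        count -= 1
    return count


def protein_copies_per_cell(spot_cpm: float,
                            cpm_per_pmol_met: float,
                            methionines: int,
                            cells_loaded: float) -> float:
    """Convert the radioactivity of a gel spot into protein copies per cell.

    spot_cpm          radioactivity measured in the spot
    cpm_per_pmol_met  specific activity normalized per methionine (assumed unit)
    methionines       methionines in the mature protein
    cells_loaded      hypothetical number of cells represented on the gel
    """
    pmol_met = spot_cpm / cpm_per_pmol_met
    pmol_protein = pmol_met / methionines
    molecules = pmol_protein * 1e-12 * AVOGADRO
    return molecules / cells_loaded
```

Because the result scales with 1/`methionines`, a miscount of one methionine changes the estimate by a factor of 2 for a protein with one or two methionines but by only about 14% for a protein with seven, which is why the average of 7.2 methionines per protein keeps this source of error small.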



The amount of radioactivity contained in a single spot might 
be the sum of the radioactivity of comigrating proteins. Be- 
cause protein identification was based on tandem mass spec- 
trometric techniques, comigrating proteins could be identified. 
However, comigrating proteins were rarely detected in this 
study, most likely because relatively small amounts of total 
protein (40 µg) were initially loaded onto the gels, which re- 
sulted in highly focused spots containing generally 1 to 25 ng of 
protein. Because of the relatively small amount loaded, the 
concentrations of any potentially comigrating protein would 
likely be below the limit of detection of the mass spectrometry 
technique used in this study (1 to 5 ng) and below the limit of 
visualization by silver staining (1 to 5 ng). In the overwhelming 
majority of the samples analyzed, numerous peptides from a 
single protein were detected. It is assumed that any comigrat- 
ing proteins were at levels too low to be detected and that their 
influence in the calculation would be small. 

The specific activity of the radiolabel was determined by 
relating the precise amount of protein present in selected spots 
of a parallel gel, as determined by quantitative amino acid 
composition analysis, to the number of methionines present in 
the sequence of those proteins and the radioactivity deter- 
mined by liquid scintillation counting. It is possible that the 
resulting number might be influenced by unavoidable losses 
inherent in the amino acid analysis procedure applied. Because 
four different proteins were utilized in the calculation and the 
experiment was done in duplicate, the specific activity calcu- 
lated is thought to be highly accurate. Indeed, the specific 
activities calculated for each of the four proteins varied by less 
than 10%. Any inconsistencies in the calculation of the specific 
activity would result in differences in the absolute levels calcu- 
lated but not in the relative numbers and would therefore not 
influence the correlation value determined. 

FIG. 7. Relationship between codon bias and protein and mRNA levels in this study. Yeast mRNA and protein expression levels were calculated as described in 
Materials and Methods. The data represent the same 106 genes as in Fig. 5. 

The protein quantitative method used eliminates a number 
of potential errors inherent in previous methods for the quan- 
titation of proteins separated by 2DE, such as preferential 
protein staining and bias caused by inequalities in the number 
of radiolabeled residues per protein. Any 2D gel-based method 
of quantitation is complicated by the fact that in some cases the 
translation products of the same mRNA migrated to different 
spots. One major reason is posttranslational modification or 
processing of the protein. Also, artifactual proteolysis during 
cell lysis and sample preparation can lead to multiple resolved 
forms of the protein. In such cases, the protein levels of spots 
coded for by the same mRNA were pooled. In addition, the 
existence of other spots coded for by the same mRNA that 
were not analyzed by mass spectrometry or that were below the 
limit of detection for silver staining cannot be ruled out. How- 
ever, since this study is based on a class of highly expressed 
proteins, the presence of undetected minor spots below silver 
staining sensitivity corresponding to a protein analyzed in the 
study would generally cause a relatively small error in protein 
quantitation. 

Codon bias is a measure of the propensity of an organism to 
selectively utilize certain codons which result in the incorpo- 
ration of the same amino acid residue in a growing polypeptide 
chain. There are 61 possible codons that code for 20 amino 
acids. The larger the codon bias value, the smaller the number 
of codons that are used to encode the protein (19). It is 
thought that codon bias is a measure of protein abundance 
because highly expressed proteins generally have large codon 
bias values (3, 13). 

Nearly all of the most highly expressed proteins had codon 
bias values of greater than 0.8. However, we detected a number 
of genes with high codon bias and relatively low protein abun- 
dance (Fig. 7). For example, the expressed gene with both the 
second largest protein and mRNA levels in the study was 
ENO2_YEAST (775,000 and 289.1 copies/cell, respectively). 
ENO1_YEAST was also present in the gel at much lower 
protein and mRNA levels (44,200 and 0.7 copies/cell, respec- 
tively). The codon bias values for ENO2 and ENO1 are similar 
(0.96 and 0.93, respectively), but the expression of the two 
genes is differentially regulated. Specifically, ENO1_YEAST is 
glucose repressed (6) and was therefore present in low abun- 
dance under the conditions used. Other genes with large codon 
bias values that were not of high protein abundance in the gel 
include EFT1, TIF1, HXK2, GSP1, EGD2, SHM2, and TAL1. 
We conclude that merely determining the codon bias of a gene 
is not sufficient to predict its protein expression level. 

Interestingly, codon bias appears to be an excellent indicator 
of the boundaries of current 2D gel proteome analysis tech- 
nology. There are thousands of genes with expressed mRNA 
and likely expressed protein with codon bias values less than 
0.1 (Fig. 4A). In this study, we detected none of them, and only 
a very small percentage of the genes detected in this study had 
codon bias values between 0.1 and 0.2 (Fig. 4B). Indeed, in 
every examined yeast proteome study (5, 7, 13, 28) where the 
combined total number of identified proteins is 300 to 400, this 
same observation is true. It is expected that for the more 
complex cells of higher eukaryotic organisms the detection of 
low-abundance proteins would be even more challenging than 
for yeast. This indicates that highly abundant, long-lived pro- 
teins are overwhelmingly detected in proteome studies. If pro- 
teome analysis is to provide truly meaningful information 
about cellular processes, it must be able to penetrate to the 
level of regulatory proteins, including transcription factors and 
protein kinases. A promising approach is the use of narrow- 
range focusing gels with immobilized pH gradients (IPG) (23). 
This would allow for the loading of significantly more protein 
per pH unit covered and also provide increased resolution of 
proteins with similar electrophoretic mobilities. A standard pH 
gradient in an isoelectric focusing gel covers a 7-pH-unit range 
(pH 3 to 10) over 18 cm. A narrow-range focusing gel might 
expand the range to 0.5 pH units over 18 cm or more. This 
could potentially increase by more than 10-fold the number of 
proteins that can be detected. Clearly, current proteome tech- 
nology is incapable of analyzing low-abundance regulatory pro- 
teins without employing an enrichment method for relatively 
low-abundance proteins. In conclusion, this study examined 
the relationship between yeast protein and message levels and 
revealed that transcript levels provide little predictive value 
with respect to the extent of protein expression. 
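
The progressive-inclusion analysis described in the Fig. 6 legend (rank the genes by protein abundance, compute the correlation for the lowest-abundance subset, then add the remaining genes one at a time) can be sketched as below. The function names are assumptions introduced for illustration, and no data from the study are included; the caller supplies paired mRNA and protein values.

```python
import math


def pearson(xs, ys):
    """Pearson product moment correlation coefficient of two equal-length lists.

    Raises ZeroDivisionError if either list has zero variance.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def progressive_correlation(mrna, protein, start):
    """Correlation for the `start` lowest-abundance proteins, then for each
    larger subset obtained by adding genes in order of protein abundance.

    Returns a list of (number_of_genes_included, correlation) pairs.
    """
    ranked = sorted(zip(protein, mrna))  # ascending protein abundance
    results = []
    for k in range(start, len(ranked) + 1):
        subset_protein = [p for p, _ in ranked[:k]]
        subset_mrna = [m for _, m in ranked[:k]]
        results.append((k, pearson(subset_mrna, subset_protein)))
    return results
```

For a 106-gene set analyzed as in the legend, the call would be `progressive_correlation(mrna, protein, start=40)`; a sharp rise in the last few entries would indicate that the overall correlation is driven by a handful of highly abundant proteins.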